o e 50 70 90 

IAATAGTCGTTTAACTAGTATTTTTTAATA_CGAAAAATTACTTAATTAAAT^ 

~JT~ -io 1 : 

-no, 130 150 170 

rAtcc|gpcATirJ^ 

^ ^ ^ 

7QA 310 33® 350 

CCCCGATGATTGATTTTTCTGTAGTGTCACCTAACGGCGTGGCAGCCTTGGTTGAAAATCAATATATTGTGAG 

PMlDFSVVSRNGVAALvcNve 

370 390 Tr » /^r^TrrTTTTArTTATAAGATTGTAAAACGAAATAACTACAAA 



570 ^ 590 610 630 

M ' "n" 1 ' ^ ' y" ' 5 ""5" R GA ^ CA K AA Y AT p CA E T " R ~ I G S G R Q F W R H D Q 



550 jatccAGAACGTGTTCGTATCGGCTCTGGACGGCAGTTTTGGCGAAATGATCAA 



690 710 



«TAT«T«T $ TT«TTTTG^ 

A 7 (x 490 510 530 

AAGATAATTTACATCCTTATGAGGACGATTACC^ 

ATATGAATGGCAGTACTTATTCAGATAGAACAAAA 
MNGSTYSORTK' 

acmacccoaccaact?^ 

770 79® 8i ® 

TGGGAGGWATGTTCGTAAAGCGGGAGAATATGGTCCATTACCGATTGCAG^ 

j ATGCT GAAAAACAAAAAT^GTTAATTAATGGGATATTV^G GG AAG G ^^^^ CTTT'PGA AG G CAAAG AAAAT G G ^ J^^^* 1 ^^^^ 

vaatcttStttgatgaaattttcgm^ 

ft 1030 1050 l® 70 

gataatggtcaggggtctataactcagaaatcaggaataccatcag^ 



1150 1170 

aaggataaagttcataatcctagatatgacggacctaatatttatjctc<^cgtttaaacjatgga^ t l y f m d q 



1090 1110 -H? . ^ TTTAAAC AATGGAGAAACGCTATATTTTATGGATCAA 

K 0 K "v" H N P R Y OGPNIYSPRL 



1210 1230 1250 

AAACAAGGATCATTAATCTTCGCATCTGACATTAACCAAGGGGCGGGTGGTCTTTATTTTGAGGGTAATTTTACAGTATCTC 
KQGSLIFAS0INQGAGGLYFEGNFTV5FN 



1310 1330 135 0 

TTGGAAAGTAAATGGCG 

N ~Q T W Q G A G I H V S E N S T V T W K V N G V E H 0 R L S 



1270 1290 CACCGTTACTTGGAAAGTAAATGGCGTGGAACATGATCGACTTTCT 



AACCAAACTTGGCAAGGAGCTGGCATACATGTAAGTGAAAATAG( 



aaaattggtaaaggaaSSgcacgttcaagccaaa^ 

KlGKGTLHVQAKGENKGSISVGOGKVILty 



1410 1430 



FIGURE 6A 



1470 1498 1510 1S30 



AOOQGN KQAFS EI G L V > ^ * u 

1 1590 1610 

fTGATACCGATAAATTT^TTTCGGCTTTCGTGGTGGTCGCTTAGATCTTAACGGGCATTCAT^ 
DTDKFYFGFRuvjKLULrivjn-> 



1630 



1650 1670 



\C<jA6GGGGCAATGATTGTGAACCATAATACAACTCAAGCCGCTAATGTCA 

17S0 177® 1790 

.mtattaataaact^^ 



183 » ISM 1878 1898 



ACCTT^ATAMCCAACCACAGA^ 

-q^A 1950 1970 

AAC^TTTCAGCG^A^ 

■mm a 2030 2050 2070 

UGTgSaTCACGATTGGATC^^^ 

91 10 2130 2150 

C^CAATTGAGGGAA^G^ 

?1 QG 2210 2230 2250 

CAGATTGGACAGGATTAACGACTTGTCAMAAGTGGATTTAACCGATACAAAAGTTATTAATT^ 
.OWTGLTTCQKVDLTDl ^vin^x 

2290 2310 2330 

XTATTAATTTAACTGATAATGCAACGGCGAATGTTAAAGGTTTAGCAAAACTTAATGGCAATGTCACTTTA 

;iMLTDNATAMVKGLARLW*«n*i 

7T70 2390 24X0 2430 

JCATJAAcfAACAATGCCACCCMATAGGCAATATTCGACTTTC™^ 

2490 " 2510 
CGCACCAAATTCAGGGAGACAAAGGCACAACAGTGACGTTG 



7450 2470 2490 2510 

GTGCATTTAACGGATTCAGCTCAATTTTCTTTAAAAAACAGCCATTTT y y L 



2570 2590 2610 

gaaaatg^acttggacaatgcctagc^Ttactacattgcagaatttaacgcta^ 

ENATWTMPSDTTLQNLTLNNST L n :> a 



2630 2650 2670 



2690 



GCTAGCTCAAACAATACGCCACGTCGCCGTTCATTAGAGACGGAAACAACGCCAACATCGGCAGAACATCGTTTCAACACATTGACAGTA 
AS SNNT PRRRS LET ETTPT SA E H R r N 

2750 2770 2790 

AATGGTAAATTGAGTGGGCAAGGCACATTCCAATTTA^ 



2810 2830 2850 



2870 



GAGGGCGATTACATATTATCT GTTCGCAACACAGGCAAAGAACCCGAAACCCTTGAGCAAT^ 
EGDYILSVRNTGKEPEi LEQL l v c ^ 



CAA 



FIGURE 6B 



2890 2910 2930 2950 2970 



CCGTTATCAGATAAGCTCAAATTTACTTTAGAAAATGACCACGTTGA 

PLSD1CLKFTLENDHV0AGALRYKLVr,nu.v.c 

2990 3010 3030 3050 

TTCCGCTTGCATAACCCAATAAAAGAGCAGGAATTGCACAATGATTTAGTAAGAGCAGAGCAAGCAGAACGAACATTAGAAGCCAAACAA 
F RL HN P I K EQE L H N DLV RA EQA E RT L.E A K. Q 

3070 3090 3110 3130 3150 

GTTGAACCGACTGCTAAAACACAAACAGGTGAGCCAAAAGTGCGGTCAAGAAGAGCAGCGAGAGCAGCGTTTCCTGATACCCTGCCTGAT 
VEPTAKTQTGEPKVRSRRAARAA FP UI lku 

3170 3190 3210 3230 

CAAAGCCTGTTAAACGCATTAGAAGCCAAACAAGCTGAACTGACTGCTGAAACACAAAAAAGTAAGGCAAAAACAAAAAAAGTGCGGTCA 
QSLLNALEAKQAELTAETQKSKAKTKKVK5 

3250 3270 3290 3310 3330 

AAAAGAGCAGTGTTTTCTGATCCCCTGCTTGATCAAAGCCTGTTCGCATTAGAAGC^ 

KRAVFSDPLLDQSLFALEAALEVIDAF^qi 

3350 3370 3390 3410 

GAAAAAGATCGTCTAGCTCAAGAAGAAGCGGAAAAACAACGCAAACAAAAAGACTTGATCAGCCGTTATTCAAATAGTGCGTTATCAGAA 
EKDRLAQEEAEKQRKQKDLISRYSNSALSt 

3430 3450 3470 3490 3510 

TTATCTGCAACAGTAAATAGTATGCTTTCTGTTCAAGATGAATTAGATCGTCTTTTTGTAGATCAAGCACAATCTGCCGTGTGGACAAAT 
LSATVNSMLSVQDELORLFVDQAQSA VWTN 

3530 3550 3570 3590 

ATCGCACAGGATAAAAGACGCTATGATTCTGATGCGTTCCGTGCTTATCAGCAGCAGAAAACGAACTTACGTCAAATTGGGGTGCAAAAA 

IAQDKRRYDSOAF RAYQQQKTNLRQI GVQK 

3610 3630 3650 3670 3690 

GCCTTAGCTAATGGACGAATTGGGGCAGTTTTCTCGCATAGCCGTTCAGATAATACCTTTGATGAACAGGTTAAAAATCACGCGAGATTA 

ALANGRIGAVFSH SRS DNTFDEQVKN HAT 

3710 3730 3750 3770 

ACGATGATGTCGGGTTTTGCCCAATATCAATGGGGCGATTTACAATTTGGTGTAAACGTGGGAACGGGAATCAGTGCGAGTAAAATGGCT 

TMMSGFAQYQWGDLQF GVNVGTGISAS KMA 

3790 3810 3830 3850 3870 

GAAGAACAAAGCCGAAAAATTCATCGAAAAGCGATAAATTATGGCGTGAATGCAAGTTATCAGTTCCGTTTAGGGCAATTGGGCATTCAG 
EEQSRKIHRKAINYGVNASYQFRLGQLGIQ 

3890 3910 3930 3950 

CCTTATTTTGGAGTTAATCGCTATTTTATTGAACGTGAAAATTATCAATCTGAGGAAGTGAGAGTGAAAACGCCTAGCCTTGCATTTAAT 
PYFGVNRYFIERENYQSEEVRVKTPSLAFN 

3970 3990 4010 4030 4050 

CGCTATAATGCTGGCATTCGAGTTGATTATACATTTACTCCGACAGATAATATCAGCGTTAAGCCTTATTTCTTCGTCAATTATGTTGAT 

RYNAGIRVDYTFT PTONISVKPYF FV NYV D 

4070 4090 4110 4130 

GTTTCAAACGCTAACGTACAAACCACGGTAAATCTCACGGTGTTGCAACAACCATTTGGACGTTATTGGCAAAAAGAAGTGGGATTAAAG 

VSNANVQTTVNLTVLQQPFGRYWQKEV G L K 

4150 4170 4190 4210 4230 

GCAGAAATTTTACATTTCCAAATTTCCGCTTTTATCTCAAAATCTCAAGGTTCACAACTCGGCAAACAGCAAAATGTGGGCGTGAAATTG 

AEILHFQISAFISKSQGSQLGKQQNVGVKL 

4250 4270 4290 4310 

GGCTATCGTTGGTAAAAATCAACATAATTTTATCGTTTATTGATAAACAAGGTGG^CAG^CAGAJ^CCAmTTTTTTATTCCAATAAT 

G Y R W ♦ 



FIGURE 6C 



Hap 

HK368IGA 

HK393IGA 

HK715IGA 

HK61IGA 

Consensus 



Hap 

HK368IGA 

HK393IG 

HK715IGA 

HK61IGA 

Consensus 



Hap 

HK368IGA 
HK393IGA 
HK715IGA 
HK61IGA 

Consensus 



Hap 

HK368IGA 

HK393IGA 

HK715IGA 

HK61IGA 

Consensus 



Hap 

HK368IGA 
HK393IGA 
HK715IGA 
HK61IGA 

Consensus 



MKKTVFRLNF 
MLNKKFKLNF 
MIJSIKKFKINF 
MLNKKEKLNF 
MLNKKFKLNF 
M F-LNF 

51 

AQNIKVYNKQ 
ATNVLVKDKN 
AINVEVRDKN 
ATNVEVRDKN 
A1NVEVRDKK 
A-N — V — K- 

101 

NVGY 

VSNGVSELHF 
VSNGVSELHF 
VSNGVSEHHF 
VSNGVSELHF 



LTACISLGIV 
IALTVAYALT 
IALTVAYALT 
IALTVAYALT 
IALTVAYALT 



SQAWAGHTYF 
PYTEAALVRD 
PYTEAALVRD 
PYTEAALVRD 
PYTEAALVRD 



GIDYQYYFDF 
DVDYQIFRDF 
DVDYQIFRDF 
DVDYQIFRDF 
DVDYQTFRDF 
— DYQ- — RDF 



GQLVGTSMTK 
NKDDGTALPN 
NRP1JGNVIPN 
NHSLGNVIPN 
NQSLGSALPN 
G 



A.PMIDFSW 
GIPMIDFSW 
GIPMIDFSW 
GIPMIDFSW 
GIPMIDFSW 
— PMIDFSW 



TDVDFGAEGN 

gsd^gnm^ng 
gni2x3nmnng 
qsusgnmsing 



NPDQHR 

NAKAHRDVSS 
NAKAHRDVSS 
NDKSHRDVSS 
NAKSHRDVSS 
N HR 



SRNG/VAALV 
DVDKRIATLI 
DVDKRIATLV 
DVDKRIATLI 
DVDKRIATLV 
A-L- 



. .FTYKIVKR 
EENRYFSVEK 
EENRYYTVEK 
EENRYFSVEK 
EENRYYTVEK 
Y~V~ 



50 

AENKGKFTVG 
AENKGKFSVG 
AENKGKFSVG 
AENKGRFSVG 
AENKGKFSVG 
AENKG-F-VG 

100 

ENQYIVSVAH 
NPQYWGVKH 
NPQYWGVKH 
NPQYWGVKH 
NPQYWGVKH 
— QY-V-V-H 

150 

NNY 

NEYPTKLNGK 
NEYPTKLNGK 
NEYPTKI24GK 
NNFPTE2WTS 
N 



151 200 
KKDNLH PYEDDYHNPR LHKFVTEAAP IDM.TSNMNG STYSDRTKYP 
TVTTEDQ.TQ KRREDYYMPR LDKFVTEVAP IEASTASSDA GTYNDQNKYP 
AVTTEDQ-AQ KRREDYYMPR LDKFVTEVAP IEASTDSSTA GTYNNKDKYP 
AVTTEDq'iXJ KRREDYYMPR LDKFVTEVAP IEASTASSDA GTYNDQNKYP 
FTTKEBQDAQ KRREDYYMPR LDKFVTEVAP IEASTANNNK GEYNNSDKYP 
DY — PR L-KFVTE-AP I T Y KYP 

201 250 

EKVRIGSGRQ F WWDQ DKGDQVAGAY 

AFVRLGSGSQ FIYKKGDNYS LIL N NH EVGG NNLKLVGDAY 

YFVRLGSGTQ FIYENGTRYE LWL G KBGQKSDAGG YNLKLVGNAY 

AFVRLGSGSQ FIYKKGDNYS LIL N NH EVGG NNLKLVGDAY 

AFVRIGSGSQ FIYKKGSRYQ LILTEKDKQG NLLRNWDVGG DNLELVGNAY 
— VR-GSG-Q F V— AY 



FIGURE 7A 
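
The Consensus row in the alignment appears to keep a residue only where all five aligned sequences agree and to print a dash otherwise. A minimal sketch of that rule follows, checked against the one block of Figure 7A that survived extraction cleanly (alignment positions 41 to 50). The function name `strict_consensus` is hypothetical, and the unanimity rule is an assumption inferred from the figure, not a stated method.

```python
def strict_consensus(rows, gap="-"):
    """Keep a residue only where every aligned row has the same character."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("aligned rows must all have the same length")
    # zip(*rows) walks the alignment column by column
    return "".join(col[0] if len(set(col)) == 1 else gap for col in zip(*rows))


# Block ending at alignment position 50 in Figure 7A; the row order
# (Hap, HK368IGA, HK393IGA, HK715IGA, HK61IGA) is assumed from the label column.
block = [
    "AENKGKFTVG",
    "AENKGKFSVG",
    "AENKGKFSVG",
    "AENKGRFSVG",
    "AENKGKFSVG",
]

print(strict_consensus(block))  # AENKG-F-VG, the Consensus row printed in the figure
```

Blocks whose rows were damaged by extraction will not reproduce the printed consensus until the rows themselves are repaired.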



Hap 

HK368IGA 

HK393IGA 

HK715IGA 

HK61IGA. 

Consensus 



Hap 

HK368IQV 

HK393IGA. 

HK715IGA 

HK61IGA 

Consensus 



251 

HYLTAQSTTHN 
TYGIAGTPYK 
TYGIAGTPYE 
TYGIAGTPYK 
TYGIAGTPYK 
-Y— i 



QRGAGSGYSY 
VNHENNGLIG 
VNHENDGLIG 
VNHENNGLIG 
VNHENNGLIG 
-G 



UGG D 

FGNSKEEHSD 
FGNSNNEYIN 
FGNSKEEHSD 
FG4SKEEHSD 



VRKAGEYGPL 
PKGILSQDPL 
PKEILSKKPL 
PKGILSQDPL 
PKGILSQDPL 
PL 



301 

SPMFIYDAEK QKWLINGILR EGNPFEGKEN GFQLVRKSYF 

SPLFVYDREK GKWLFD3SYD FWAGYN KKSWQ 

SPIFVYDREK GKWIFLGSYD YWAGYN KKSWQ 

SPIfVYDBEK GKWLFDGSYD FWAGYN KKSWQ 

SPIJEVYDFEK GKWIFLGSYD FWAGYN KKSWQ 

SP-F-YP-EK -KWL — G 



300 

PIAGSKGDSG* 
TNYAVLGDSG 
TNYAVLGDSG 
TNYAVLGDSG 
TNYAVLGDSG 
GDSG 

* 

350 

D.EIFERDLH 
EWNIYKSQFT 
EWNIYKPEFA 
EWNIYKPEFA 
EWNIYKHEFA 



351 400 

Hap TSLYTRAGNG VYTISGNDNG QGSITQKSGI PSEIKITLAN MSIPLKEKDK 

HK368IGA KDVLNKDSAG SLIGSKTDYS WSSNGKTSTI TGGEK S LNVDIAD. . . 

HK393IGA EKIYEQYSAG SLIGSKTDYS WSSNGKTSTI TGGEK S LNVDLAD. . . 

HK715IGA KTVLDKDTAG SLTGSNTQYN WNPTGKTSVI SNGSE S LNVDLFD. . . 

HK61IGA EKIYQQYSAG SLTGSNTQYT WQATGSTSTI TQGGE. „^P LSVDLTEh_ 

Consensus G S S-I L 

401 450 

Han VHNPRYDGPN IYSPRLNNGE TLYFMDQKQG SLIFASDINQ GAGGLYFEGN 

HK368IGA .GKD KPNHGK SVTFEG. -SG TLTLNNNIDQ GAGGLFFEGD 

HK393IGA * GKD KPNHGK SVTFEG. .SG TLTLNNNIDQ GAGGLFFEGD 

HK715IGA .SSQD TDSKKNNHGK SVTLRG. .SG TLTLNNNIDQ GAGGLFFEGD 

HK61IGA. . -GKD KPNHGK SITLKG. .SG TLTLNNHIDQ GAGGLFFEGD 

Consensus N-G G -L I-Q GAGGL-FEG- 

451 500 

Hap FTVSPNSNQ. TWQGAGIHVS ENSTVTWKVN GVEHDRLSKI GKGTLHVQAK 

HK368IGA YEVKGTSDNT TWKGAGVSVA EGKTVTWKVH NPQYDRLAKI GKGTLXVEGT 

Hk393lGA YEVKGTSDNT TWKGAGVSVA EGKTVTWKVH NPQYDRLAKI GKGTLTVEGT 
HK715IGA YEVKGTSDST TWKGAGVSVA DGKTVTWKVH NPKSDRLAKI GKGTLTVEGK 
HK61IGA YEVKGTSDST TWKGAGVSVA DGKTVTWKVH NPKYDRLAKI GKGTLWEGK 

Consensus — V S TW-GAG— V TVTWKV DRL-KI GKGTL-V 



FIGURE 7B 



550 

SLr^TSVG DGKVI1EQQA DDQGNKQAFS EIGLVSGRGT VQIisDDKQFD . 

Soma SS nSgThafa svgivsgrst lvi*ddkqvd- 

i ll ^Ssg^SSSSRSgSSSSSSEEBS 

Consensus G-N-G VG LX^-vxj^-v^ «• 

551 600 
Han TDKFYPQFFG GRIDLNGHSL TEKRIQOTDE GAMIVNHNTT QAA^ITGN 

HK368IGA ™nTOEW3 GRIDINGNSL TFDHIFNIDD GABLVNH^T NASNITITCE 
S^lSa SllSSpG GRIDIHGNSL TFDHIPNIDE GARLVNHSTS WiSTVTITGD 

SisSst p^iSSis; gridangnnl tfehibnidd grblvnhnts ktstvtit^ 

SSigT ^SiSpG GRIDIlOaSL TFDHIPNIDD GARVVJ^TT NTSNITITGE 
Consensus YFGFRG GRLD-NG-L TF-I-N-D- GA-VNH 

650 

601 

sLirSrrr pynidapded npyaffrikd <^ly£nlen yttcalkkga 

SSs iSvKPLEDD NPYAIRQIKY GYQLYFNEEN RTYYALKKDA 
S^TOPOTIT PYNIDAPDED NPYAFBRIKD GGQLYUOEN YTYYALRKGA 

Sicf SS? iSS)DD HPmiRSIPY R.QLYFNCPN 

Consensus — I — PN 

700 

651 n NINKLDYRKE IAYNGWFGET 

SSgSIGA s^eipkns GESNE^YM ^SDE^KRN VMNHINNEBM nctwyp^ 
^931^ sSsEEIW GESNNSWLYM GTEKM)AQKN AMNHINNEBM NGEWYF^ 
^isiS sSsSpSs GESNENWLYM GKTSDEAKRN VMNHINNERM 
SiST S^ELPQNS GESNENWLYM GRTSDEAKPN V^INN^4 NGF^YF^ 
Consensus 1 N N 1VJ 



Hap 

HK368IGA 

HK393IGA 

HK715IQV 

HK61IGA, 

Consensus 



701 

D.KNKHNGRL 
EGK. .NNGNL 
EGK. .NNGSL 
EGK. .NNGNL 
ETKZVTOQGKL 
— K NG-L 



NLIYKPTTED 
NVTFKGKSEQ 
NVTFKGKSEQ 
NVTFKGKSEQ 
NVTFNGKSDQ 



RTLLLSGGTN 
NRFLLTGGTN 
NRFLLTGGTN 
NRFLLTGGTN 
NRFLLTGGTN 
LL-GGTN 



LKGDITQTKG 
LNGDLTVEKG 
LNGDLNVQQfG 
LNGDLKVEKG 
LNGDLNVEKG 
L-GD G 



750 

KLFFSGRPTP 
TLFLSGRPTP 
TLFLSGRPTP 
TLFLSGRPTP 
TLFLSGRPTP 
-LF-SGRPTP 




800 



751 



Hap 

HK368IGA 

HK393IGA, 

HK715IGA 

HK61IGA. 

Consensus 



Hap 

HK368IG7V 

HK393IGA 

HK715IGA 

HK61IGA 

Consensus 



Hap 

HK368IGA 

HK393IGA. 

HK715IC3V 

HK61IGA 

Consensus 



Hap 

HK368IG& 

HK393IGZV 

HK715IGA 

HK61IGA 

Consensus 



Hap 

HK368IGA. 
HK393IGA 
HK715IGA 
HK61IGA 

Consensus 



HAYNHLNKRW SEMEG. .IPQ GEIVWDHDWI *^FK?^Q S^SIg 
u^wvrarT*;*; TKKTiPHFAEN NEVWEDDWI NRNFKATIMN VTGNAbL*^ 
S^SSS ^WVEDDWI NRNFKATNIN VTNNATLYSG 

JSSSS NEVWEDDWI NRNFKA.TNIN VTNNATLYSG 

Si !SS S£™ ™™ 



HA- 



= = = SS 

BNVESITSNI TASNNAKVHI GY. .KAGDTV CVRSDYTGYV TCTTO^U- 
S^ITSNI TASDNAKVHI GY. .KAGDTV CVRSDYTGYV TCTTK^SD. 
SS TASNNAQ7HI GY..KTGDTV CVRSDYTGYV TONSNL^. 
RNV — I — N- T-S— A G T- C^ur ^ 

900 

KVINSIPKTQ INGSINLTDN ATANVKGLAK I20GNVTLTNH SQFTLSNNAT 

KALNSENPTN LRGNVNLTES A 

KALNSFNPTN IPGNVNLTES A 

KALNSENATN VSGNVNLSGN A 

KALNSFNPTN LRGNVNLTEN A - • " 1111111111 

K — NS T G — NL A 



850 



QIGNIRLSDN STATVDNANL NGNVHLTDSA QFSLKNSHFS HQIQGDKGTT 
U .NFVLGKANL FGTIQSRGNS QVRLT 

NFVLGKANL FGTIQSRGNS QVRLT 

NFVLGKANL FGTISGTGNS QVRLT 

" * . SFTLGKANL FGTIQSIGTS QVNLK. . . . . ^^.111111 
1111111111 1 MIL -G Q" -L 



950 



VTLENATWTM PSDTTLQNLT LNNSTITLNS ^SNNTP RRRSLETETT 

ENSHWHL TGNSDVHQLD LANGHIHLMS ADNSNNVTK 

* " ' ENSHWHL TGNSDVHQLD LANGHIHLNS ADNSNNVTK 

• " ENSHWHL TGDSNVNQLU LDKGHIHLNA Q^ANKVTT 

ENSHWHL TGNSNVNQLN LTNGHIHLNA QNDANKVTT^ — 
IHeN—W L- L I-LN 



1000 




1001 

Hap PTSAEHRFNT LTVNGKLSGQ GTFQFTSSLF GYKSDKLKLS 

HK368IGA YNT LTVNS.LSGN GSFYYLTDLS NKQGDKVWT 

HK393IGA YNT LTVNS.LSG* GSFYYLTDLS NKQGDKVWT 

HK715IGA YNT LTVNS.LSGN GSFYYLTDLS NKQ3DKWVT 

HK61IGA YNT LTVNS.LSGN GSFYYWVDFT NNKSNKWVN 

Consensus — NT LTVN— LSG- G-F :. K 



Hap 

HK368IGA 
HK393IGA 
HK715IGA 
HK61IGA 

Consensus 



Hap 

HK368IGA 
HK393IGA 
HK715IGA 
HK61IGA 

Consensus 



Hap 

HK368IGA 

HK393IGA 

HK715IGA 

HK61IGA 

Consensus 



Hap 

HK368IGA 

HK393IGA 

HK715IGA 

HK61IGA 

Consensus 



1051 

VRNTGKEPET 
VADKTGEPNH 
VADKTGEPNH 
VADKTGEPTK 
VADKTGEPNH 
V EP — 

1101 

DGEFRLHNPI 
NGEYDLXNP. 
NGE^YDLYNP. 
NGRYDLYNP. 
NGRYDLYNP. 
-G L-NP- 

1151 



LEQLTLVESK 
.NELTLFDAS 
.NELTLFDAS 
.NELTLFDAS 
.NELTLFDAS 
LTL 



DNQPLSDKLK 
KAQR. .DHLN 
KAQR. .DHLN 
NATR. .NNLN 
NATR. .NNLE 
L- 



FTLENDHVDA 
VSLVGNTVDL 
VSLVQSITVDL 
VSLVGNTVDL 
VTLANGSVDR 
— L VD- 



1050 
NDAEGDYILS 
KSATGNFTLQ 
KSATQ^ETLQ 
KSATGNFTLQ 
KSATGNFTLQ 
— A-G L- 

1100 
GALRYKLVKN 
GAWKYKIPNV 
GAWKYKLRNV 
GAWKYKLRNV 
GAWKYKLRNV 
GA — YKL 



1150 

KEQELHNDLV 

.EVEKRNQTV DTTNITTPNN IQADVPSVPS NNEEIARVDE 
.EVEKRNQTV DTTNITTPNN IQADVPSVPS NNEEIARVDE 
.EVEKPNQTV DTTNITTPNN IQADVPSVPS NNEEIARV.E 
.EVEKRNQTV DTTNITTPND IQADAPSAQS NNEEIARV.E 
-E-E — N — V 

1200 



APVPPPAPAT 

APVPPPAPAT 

TPVPPPAPAT 

TPVPPPAPAT ESAIASEQPE TRPAETAQPA MEETNTANST ETAPKSDTAT 



1201 



1250 

RAEQAERTLE AKQVEPT 

. . . . . . . . . . PSETTETVAE NSKQESKTVE KNSQDATETT AQNREVAKEA 

[[[ PSETTETVAE NSKQESKTVE KNEQDATETT AQNREVAKEA 

]\ PSETTETVAE NSKQESKTVE KNEQDATETT AQNGEVAEEA 

QTCNPNSESV PSETTEKVAE NPPQE3SETVA KNEQEATEPT PQNGEVAKED 



FIGURE 7E 



1300 

1251 -. 

Sgsiga ^^S?^^^^ """"-•• I ".".".Skive 

HK393IGA KSNVKANTQT ISSS^E SSSS" ETAKVE 

Consensus A-TQT E 

1350 

1301 

HK368IGA KEEK .".*.*.*.*. 

HK393IGA KEEK. ^ . .~ ^^J^** Ag^W^T SPKQAKPAPK EVSTDTKVEE 

S!^ S «««« 

Consensus 

1400 



1351 



Hap 

HK368IGA 

HK393IGA 

HEC715IG& 

HK61IG& 

Consensus 



Hap 

HK368IG& 

HK393IGA. 

HK715IGA 

HK61IGA. 

Consensus 



Hap 

HK368IQV 

HK393IGA. 

HK715IQV 

HK61IGA. 

Consensus 



•^MOTO STTV^ARERT SENSKBEEr IcpSEKMMS P**™^ 

SPNSKPAEET QQPfEKT^E PWFWS^ 



1401 PKVRS RRAARAAFPD TIP. 

iKVETE KTCSWPKVTS QVSPKQEQSE T. . . 

.AKVETE KTQEVPKVTS QVSPKQEQSE T. . 



1450 




P-V-S 



1451 



1500 
V 

• • • • v 

V 

V 



&>i&£Li TK^QPQ AQPQ^STAV PTi^T^S KPAAkW 




1550 

1501 n OSIMHffl KQAEL TAETQKSKAK TKK. . . , - - 

Hap D ^rJffrTx^o OSOINT TADTEQPAKE TSSNVE 

ess ffSsS==- 

1 M — E — w* * 

Consensus 

1600 

1551 V RSKBAVFSDP LLDQSL 

Hap ICPVT ESTTVNTGNS WEN 

HK368IGA. ESTTVNTGNS WEN 

HK393IGA • • 'L^^ ^ETAAST EDASQHKANT VADNSVANNS 

S 5 !^ ENSl^ SgEvi KND^EANT VADNS^NNS 

Consensus 

1650 

1601 F aieaALEVID APQQSEKDBL AQEEAEKQBK 

Hap PENTTPATTQ PTVNSESSN. .KPK.NBHBR 

HK368IGA PENTTPATTQ PTVNSESSN. .KPK.NBHRR 

HK393IGA nkrkc^TS AEETTAASTD ETTIADNSKR SKPN.PRSRR 

Consensus 

1700 

I 651 QKDLI SRYSNSALSE 

^VPHNVEPATTSS^>:: ! *. ! \ \ \ ^VAI^^ 
7PHNVE 

SV^TN^EP? EI^nSnA ENVQSGNNVA NS^BNLT SW^TNAVLSN 



Hap 

HK368IGA 
HK393IGA 
HK715IGA 
HK61IGA 
Consensus 



— lis'- 1 



Hap 

HK368IGA 

HK393IGA 

HK715IGA. 

HK61IGA, 

Consensus 



1701 

LSA TV 

ARAKACFVAL 

AMAKAQFVAL 
— A 



NSMLSVQDEL 
NVGKAVSQHI 
NVGKAVSQHI 
NVGKAVSQHI 
NVGKAVSQHI 
N V 



DRL-FVDQAQ 
SQLEMNNBGQ 
SQI£MNNEX3Q 

SQIEMNNEGQ 
— L Q 



SAVWTNIXP 
YNVWVSNTSM 
YNVWVSNTSM 
YNVWVSNTSM 
YNVWISNTSM 
— VW 



1750 
KRRYDSDAFR 
NKNYSSSQYR 
NKNYSSSQYR 
NENYSSSQYR 
NKNYSSEQYR 
Y-S R 



FIGURE 7G 



Hap 

HK368IGA 

HK393IGA 

HK715IGA 

HK61IGA 

Consensus 



Hap 

HK368IGA 

HK393IGA 

HK715IGA 

HK61IGA 

Consensus 



Hap 

HK368IGA 

HK393IGA 

HK715IGA 

HK61IGA. 

Consensus 



Hap 

HK368IGA 

HK393IGA 

HK715IGA 

HK61IGA 

Consensus 



Hap 

HK368IGA 

HK393IG?V 

HK715IGA 

HK61IGA 

Consensus 



1751 

AYQQQKTNLR 
RFSSKSTQfTQ 
RFSSKSTQfTQ 
RFSSKSTQTQ 
RFSSKSTQfTQ 



QIGVQKALAN 
U3WDQTISNN 
LGWDQTISNN 
UGWDQfTISNN 
UGWDQTISNN 
Q N 



GRIGAVFSHS 
VQUGGVFTYV 
VQIJGGVFTYV 
VQLGGVFTYV 
VQLGGVFTYV 
G-VF 



RSDNTFDEQV 
RNSNNFDKAT 
RNSNNFDKAT 
RNSNNFDKAS 
RNSNNFDKAS 
K — N-FD 



1800 
KNHATLTmS • 
SKN.TLAQVN 
SKN.TLAQVN 
SKN.TLAQVN 
SKN.TLAQVN 
TL 



1801 

GFAQYQflSDL 
FYSKY.YADN 
FYSKY.YADN 
FYSKY.YADN 
FYSKY.YADN 
Y D- 



QF. .GVNVGT 
HWYLGIDDGY 
HWYU3IDUGY 
HWYLGIDLGY 
HWYLGIDLGY 



GISASKMAEE 
GKFQSKLQTN 
GKFQSKLQfTN 
GKFQSNLKTN 
GKFQSNLQTN 

G S- 



qsrklhrkai 
hnakfarhta 
hnakfarhta 
hnakfarhta 
nnakfarhta 



1850 
NYGVNASYQF 
QFGLTAGKAF 
QFGLTAGKAF 
QFGLTAGKAF 
Q1GLTAGKAF 



K — R G— A— F 



1851 

RLGQLGIQPY 
NLGNFGTEPI 
ND3JFGTTPI 
NL/33FGITPI 
NL/33FAVKPT 

-LG P- 



FGVNRYFIER 
VGVRYSYLSN 
VGVRYSYLSN 
VGVRYSYLSN 
VGVRYSYLSN 
-GV 



ENYQSEEVRV 
ADFALDQARI 
ADFALDQARI 
ANFAIAKDRI 

ADFALACPRI 
R- 



1900 

KTPSLAFNRY NAGIRVDYTF 
KVNPISVKTA FAQVDLSYTY 
KVNPISVKTA FAQVDLSYTY 
KVNPISVKTA FAQVDLSYTY 
KVNPISVKTA FAQVDLSYTY 
K _ -A- YT- 



1901 

TPTDNISVKP 
.HLGEFSVTP 
.HLGEFSVTP 
.HLGEFSVTP 
.HLGEFSITP 
S~P 



YFFVNYVDVS NANVQTTVNL 
ILSARY.DAN QGSGKINVNG 
ILSARY.DAN QGSGKINVNG 
ILSARY.DTN QGSGKINVNQ 
ILSARY.DAN QQ^GKINVSV 
Y-D V— 



TVLQQPFGRY 
YDFAYNVENQ 
YDFAYNVENQ 
YDFAYNVE3SK2 
YDFAYNVENQ 



1950 
VJQKEVGLKAE 
QQYNAGLKLK 
QQYNAGLKLK 
QQYNAGLKLK 
QQYNAGLKLK 
-Q GLK — 



1951 

ILUFQISAFI 
YHNVKLSLIG 
YHNVKLSLIG 
YHNVKLSLIG 
YHNVKLSLIG 



1982 

SKSQGSQLGK QQNVGVKL^Y RW 
GLTKAKQAEK QKTAELKLSF SF 
GLTKAKQAEK QKTAELKLSF SF 
GLTKAKQAEK QKTAELKLSF SF 
GLTKAKQAEK QKTAEVKLSF SF 
q — K Q KL 




[Gel image; molecular-mass markers: 105.1, 69.8, 43.3, 28.3 and 18.1 kD]

FIGURE 8 



50 



HapNl87 
HapTN106 
Hap860295 
Consensus 

HapN187 
HapTN106 
Hap860295 
Consensus 

HapNl87 
HapTN106 
Hap860295 
Consensus 

HapN187 
HapTN106 
Hap860295 
Consensus 

HapNl87 
HapTN106 
Hap860295 
Consensus 

HapNl87 
HapTN106 
Hap860295 
Consensus 

HapN187 
HapTNl06 
Hap860295 
Consensus 

HapN187 
HapTN106 
Hap860295 
Consensus 

HapN187 
HapTNl06 
Hap860295 
Consensus 



(1) 
(1) 
(1) 
(1) 



MKKTV FRLN FLTAC I S LG I V S QAWAGHT Y FG I D YQY YRDFAENKGKF VG 




100 



(101) 
(101) 
(101) 
(101) 

(151) 
(149) 
(151) 
(151) 

(196) 
(195) 
(201) 
(201) 

(246) 
(245) 
(251) 
(251) 

(295) 
(294) 
(301) 
(301) 

(340) 
(344) 
(346) 
(351) 

(387) 
(392) 
(396) 
(401) 



PM D S 



R G 



mmm 




Ikdnlh 



In 

gy" vdfg eg npdqhrf y vkr nyk 

151 

M G Y YP RVR G G Q W 




|S| — DRQYNraQH 

Ihgi 



DY 





PRL KFV 
200 

nn|tiH 

EEQKQ|pKSSW 
G 

250 





PMFIYD KWL NG L 
301 

slytrag |P v1t1sg1d! 
n|wdtnaeyrfn|g1dh 

TgLEPRS|l§H§|SFT 
351 

-e|d|vhn|N 

RD i N sl ES l s 

al|e^ke|v 

K P Y 

401 



S GDSGS 
300 

i s ^ DE_ H ER H HT 

bMQG-pNQ^TA 

w|ydnv|veHpi 

F D 

350 

; QiSI^^SGlffiE|KITLA|MSlPLK 

Irvat i ksIlpkIai qKr|vgl y d§s|1h DA 

| T^V^TNEK^MPQFKVRTV^FNE 

L 

400 

1 




FASDffl 



mam 



YF D 



INQGAG 
450 




|pgsg-oaig 

i KG iQi Ni ^ 

|SEN|j-A|| 
GLYFEGNF V ' N TWQGAG 




V E STV W V E DRLSKIG G 



FIG. 11A 



451 



HapN187 (436 

HapTN106 (442 

Hap860295 (445 

Consensus (451 

HapN187 (486 

HapTN106 (492 

Hap860295 (495 

Consensus (501 

HapN187 (536 

HapTN106 (542 

Hap860295 (545 

Consensus (551 

HapN187 (585 

HapTN106 (587 

Hap860295 (595 

Consensus (601 

HapN187 (635 

HapTN106 (637 

Hap860295 (643 

Consensus (651 

HapN187 (685 

HapTN106 (687 

Hap860295 (693 

Consensus (701 

HapN187 (735 

HapTN106 (737 

Hap860295 (743 

Consensus (751 

HapN187 (785 

HapTN106 (787 

Hap860295 (793 

Consensus (801 

HapN187 (835 

HapTN106 (837 

Hap860295 (843 

Consensus (851 




KG N GS SVG G V L QQAD KQAF E G VSGR TVQL 
^_ 550 



N Q D Y FG FRGGRL DLNGH S LT F RIQNTDEGAMIVNHN Q A 
551 600 

—- /lI-Sg! 








S DSKQgTNK 

'AisiK; 



N TITGN I 
601 




L K IA NGWFGE D K NGRLN Y 

650 



P LLLSGGTNL G ITQ G L FSGRPTPHAYNHL 

651 



700 



£§Tf; \ "SrkfSim} % f«f* pvf?1 













G PQGE V D DWI RTFKAENFQIKGGSAWSRNVSSIEGNWTVSNNAN 
701 750 




A FG WPNQQNT I CTRS DWTGLTTC VDLTD KVINSIP TQINGSINL 
751 800 




TDNAT N GLAKLNGNVTL HSQFTLSNNATQ GNI LS ATVDN 
801 850 

m 




ANLNGNV L DSAQFSLKNSHFSHQIQG TTV LENATWTMPSD TLQ 
851 900 
III 




„ i 

NLTLNNST TLNSAYSA SNN PR RRSLETETTPTSAEHRFNTLTVNG 



FIG. 11B 



901 



950 



HapN187 
HapTNl06 
Hap860295 
Consensus 

HapN187 
HapTNl06 
Hap860295 
Consensus 

HapN187 
HapTN10 6 
Hap860295 
Consensus 

HapN187 
HapTN106 
Hap860295 
Consensus 

HapN187 
HapTN106 
Hap860295 
Consensus 

HapN187 
HapTNl06 
Hap860295 
Consensus 

HapN187 
HapTN106 
Hap860295 
Consensus 

HapN187 
HapTN106 
Hap860295 
Consensus 

HapN187 
HapTN106 
Hap860295 
Consensus 



(885) 
(887) 
(892) 
(901) 

(935) 
(937) 
(942) 
(951) 

(985) 
(987) 
(992) 
(1001) 

(1035) 
(1034) 
(1039) 
(1051) 

(1076) 
(1073) 
(1085) 
(1101) 

(1091) 
(1086) 
(1135) 
(1151) 

(1140) 
(1136) 
(1181) 
(1201) 

(1190) 
(1186) 
(1231) 
(1251) 

(1240) 
(1236) 
(1281) 
(1301) 



KLSGQGTFQFTSSLFGYKSDKLKLSNDAEGDY LS VRNTGKE P UL1 
951 1000 
H 






LVESKDN PLSDKL FT LEN DH VDAG ALR YKL VKN GEFRLHNPlKfc^k-L. 
1001 1° 50 

DLVRAEQAERTLEAKQVE TA TQT VRS RA F D LP QS 

1051 1100 

_ LDVLQA 

iT L A Q ~~ T E Q K KKVRSKRA FSD L DQ 
1101 1150 

iFAj 

I 

EQ VE PT AE KQKN KAKKVRS KRAARE FSDTPLDLSR j§KVj 

L LEVI A Q 

1151 _ 1200 

Is eIdrlaI ^ a s 
|t^^e^r 
|d|| — fjl~ — - 

Q K Q EK QRKQK LISRYSNSALSELSATVNSMLSVQDELDR 
1201 1250 




LFVDQAQSAVWTNIAQDKRRYDSDAFRAYQQKTNLRQIGVQKAL NGRIG 
1251 1300 




AVFSHSRSDNTFDEQVKNHATL MM S G FAQ YQWG DLQ FG VN VG GISASK 
1301 1350 




MAEEQSRKIHRKAINYGVNASYQFRLGQLGIQPY GVNRYFIERENYQSE 



FIG. 11C 



1400 



HapN187 


(1290) 


HapTN106 


(1286) 


Hap860295 


(1331) 


Consensus 


(1351) 


HapN187 


(1340) 


HapTN106 


(1336) 


Hap860295 


(1381) 


Consensus 


(1401) 


HapN187 


(1390) 


HapTN106 


(1386) 


Hap860295 


(1431) 


Consensus 


(1451) 



OMR 



EV V 
1401 



TPSL FNR YN AG I RV D YT FT PT DN I S 



KPYFFVNYVDVSNANVQI 
1450 




LQQ FGRYWQKEVGLKAEILHFQ S AFI SKSQGSQLGKQQNVGV 



KLGYRW 



FIG. 11D 






[Gel images, panels A and B, lanes 1 to 4 in each panel; molecular-mass markers: 221, 96.7, 71.8, 45.5, 28.6 and 19.7 kDa]



Nucleotide sequence for NTHi strain 11 hap gene (start codon to stop codon):



1 ATGAAAAAAA CTGTATTTCG TCTTAATTTT TTAACCGCTT GCATTTCATT 
51 AGGGATAGTA TCGCAAGCGT GGGCAGGTCA TACTTATTTT GGGATTGACT 
101 ACCAATATTA TCGTGATTTT GCCGAGAATG AAGGCAAGTT TGCAGTTGGG 
151 GCTAAAAATA TTGATGTTTA TAACAAAGAA GGGCAATTAG TTGGCACATC 
201 AATGACAAAA GCCCCGATGA TTGATTTCTC AGTCGTTTCC AGAAATGGAG 
251 TTGCTGCCTT AGTAGGCGAT CAGTATATTG TGAGTGTGGC ACATAATGTA 
301 GGCTATACCA ATGTGGATTT TGGTGCTGAA GGACAAAATC CTGATCAACA 
351 TCGTTTTACT TATAAAATTG TGAAACGGAA TAATTATAAT CACGATGCGA 
401 AGCACCGCTA TCTAGATGAC TACCATAATC CACGTTTACA TAAATTTGTA 
451 ACGGATGCGG CACCAATTGA TATGACTTCA CATATGGATG GCAATAAGTA 
501 TGCAAATAAG GAAAAATATC CTGAACGAGT ACGCGTCGGA TCTGGAGATC 
551 AGTATTGGGA TGACGATCAA AACAACAGAA CTTATTTATC TGACGGATAT 
601 AATTATTTAA CAGGTGGGAA TACATATAAT CAAAGCGGTA GAGGTGATGG 
651 ATATTCATAT GTGAGAGGTG ATATTCGCAA AGTTGGCGAT TATGGTCCAT 
701 TACCGATTGC AAGTTCATTC GGGGACAGTG GATCTCCAAT GTTTATTTAT 
751 GATGCTGAAA CACAAAAATG gcTAATTAAT GGAGTATTGC GGGAGGGGCA 
801 ACCTTATACA GGCGAATTCG ATGGATTTCA ATTAGCCCGT AAATCTTTCC 
851 TTGATGAAAT TATACGCAAA GATCAACCAA ATGGTTTTTT AACCCCTAAG 
901 GGGAATGGCG TTTATACCAT TTCTAAAAGT GACGATGGGA TAGGAGTTGT 
951 TACTTCGAAA ATTGGAAAAC CTCGTGAAAT ACCTTTAGCG AACAACAAAT 
1001 TAAAAATAGA AGATAAAGAT ACTGTCTATA ATAACAGATA TAATGGTCCT 
1051 AATATTTATT CTCCTCAATT AAACAATGGC AAGAATATTT ATTTTGGAGA 
1101 TGAAGAATTA GGATCCATAA CTTTAACGAC TGATATCGAT CAAGGTGCAG 
1151 GCGGTTTGTA TTTTGAGGGG GATTTTATAG TTTCGCCTAC CAAAAATGAA 
1201 ACGTGGAAAG GTGCGGGCAT TCATGTCAGT GAAATTAGTA CCGTTACTTG 
1251 GAAAGTAAAC GGCGTGGAAA ATGATCGACT TTCTAAAATC GGTAAAGGAA 
1301 CATTACACGT TAAAGCCAAA GGGGAAAATA AAGGTTCGAT CAGCGTAGGC 
1351 GATGGTAAAG TCATTTTGGA GCAGCAGGCA GACGATCAAG GCAACAAACA 
1401 AGCCTTTAGT GAAATTGGCT TGGTTAGCGG CAGAGGGACT GTTCAATTAA 
1451 ACGATGATAA ACAATTTGAT ACCGATAAAT TTTATTTCGG CTTTCGTGGT 
1501 GGTCGCTTAG ATCTTAACGG ACATTCATTA ACCTTTAAAC GTATCCAAAA 
1551 TACGGACGAG GGGGCGATGA TTGTGAACCA TAATACAACT CAAGTCGCTA 
1601 ATATTACTAT TACTGGGAAC GAAAGTATTA CTGCTCCATC TAATAAAAAT 
1651 AATATTAATA AACTTGATTA CAGCAAAGAA ATTGCCTACA ACGGCTGGTT 
1701 TNGCGAAACA GATAAAAATA AACATAATGG ACGATTAAAC CTTATTTATA 
1751 AACCAACCAC AGAAGATCGT ACTTTGCTAC TTTCAGGCGG CACAAACTTA 
1801 AAAGGCGATA TTACTCAAAC AAAAGGTAAA CTATTTTTCA GCGGTAGACC 
1851 GACACCCCAC GCCTACAATC ATTTAGACAA ACGTTGGTCA GAAATGGAAG 
1901 GTATCCCACA AGGCGAAATT GTGTGGGATT ACGATTGGAT TAACCGCACA 
1951 TTTAAAGCTG AAAACTTCCA AATTAAAGGC GGAAGTGCGG TGGTTTCTCG 
2001 CAATGTTTCT TCAATTGAGG GAAATTGGAC AGTCAGCAAT AATGCAAATG 



Fig. 16A 



2051 CCACATTTGG TGTTGTGCCA AATCAGCAAA ATACCATTTG CACGCGTTCA 
2101 GATTGGACAG GATTAACGAC TTGTAAAACA GTTAATTTAA CCGATAAAAA 
2151 AGTTATTGAT TCCATACCGA CAACACAAAT TAATGGTTCT ATTAATTTAA 
2201 CTGATAATGC AACAGTGAAT ATTAATGGTT TAGCAAAACT TAATGGTAAT 
2251 GTCACTTTAA TAAATCATAG CCAATTTACA TTGAGCAACA ATGCCACCCA 
2301 AATAGGCAAT ATCAAACTTT CAAATCACGC AAATGCAAGG GTAAATAATG 
2351 CCACTTTAAT GGGCGATGTG AATTTAGCGG ATACTAGCCG TTTTACATTA 
2401 AGCAATCAAG CAACACAGAT TGGCACAATC AGTCTTCATC AGCAAGCTCA 
2451 AGCAACAGTG GATAATGCAA ACTTGAACGG TAATGTGCAT TTAACGGATT 
2501 CTGCCAGATT TTCTTTAAAA AACAGTCATT TTTCGCACCA AATTCAGGGC 
2551 GACAAAGACA CAACAGTGAC GTTGGAAAAT GCGACTTGGA CAATGCCTAG 
2601 CGATACTACA TTGCAGAATT TAACGCTAAA TAATAGTACT GTTACGTTAA 
2651 ATTCAGCTTA TTCAGCTAGC TCAAATAATG CGCCACGTCG CCgCCGTTCA 
2701 TTAGAGACGG AAACAACGCC AACATCGGCA GAACATCGTT TCAACACATT 
2751 GACAGTAAAT GGTAAATTGA GCGGGCAAGG CACATTCCAA TTTACTCCAT 
2801 CTTTATTTGG CTATGAAAGC GATAAATTAA AATTATCCAA TGACGCTGAG 
2851 GGCGATTACA CATTATCTGT TCGCAACACA GGCAAAGAAC CCGTGACCCT 
2901 TGAGCAATTA ACTTTGGTTG AAAGCAAAGA TAATAAACCG TTATCAGACA 
2951 AACTCAAATT TACTTTAGAA AATGACCACG TTGATGCAGG TGCATTACGT 
3001 TATAAATTAG TGAAGAATAA GGGCGAATTC CGCTTGCATA ACCCAATAAA 
3051 AGAGCAGGAA TTGCGCTCTG ATTTAGTAAG AGCAGAGCAA GCAGAACGAA 
3101 CATTAGAAGC CAAACAAGTT GAACAGACTG CTGAAACACA AACAAGTAAT 
3151 GCAAGAGTGC GGTCAAGAAG AGCGGTGTTG TCTGATACCC CGTCTGCTCA 
3201 AAGCCTGTTA AACGCATTAG AAGTCAAACA AGCTGAACCG AATGCTAAAA 
3251 CACAAAAAAG TAAGGCAAAA ACAAAAAAAG CGCGGTCAAA AAGAGCATTG 
3301 AGAGAAGCGT TTTCTGATAC CCCGCCTGAT CTAAGCCAGT TAAACGTATT 
3351 AGAAGCCGCA CTTAAGGTTA TTAATGCCCA ACCGCAAACA GAAAAAGAAC 
3401 GTCAAGCTCA AGAGGAAGAA GCGAAAAGAC AACGCaAACA AAAAGACTTG 
3451 ATCAGCCGTT ACTCAAATAG TGCGTTATCG GAGTTGTCTG CAACAGTAAA 
3501 TAGTATGCTT TCCGTTCAAG ATGAATTGGA TCGTCTTTTT GTAGATCAAG 
3551 CACAATCTGC CCTGTGGACA AATATCGCAC AGGATAAAAG ACGCTATGAT 
3601 TCTGATGCGT TCCGTGCTTA TCAGCAGAAA ACGAACTTGC GTCAAATTGG 
3651 GGTGCAAAAA GCCTTAGATA ATGGACGAAT TGGGGCGGTT TTCTCGCATA 
3701 GCCGTTCAGA TAATACCTTT GACGAACAGG TTAAAAATCA CGCGACATTA 
3751 ACGATGATGT CGGGTTTTGC CCAATATCAA TGGGGCGATT TACAATTTGG 
3801 TGTAAACGTG GGCGCGGGAA TTAGTGCGAG TAAAATGGCT GAAGAACAAA 
3851 GCCGAAAAAT TCATCGAAAA GCGATAAATT ATGGTGTGAA TGCAAGTTAT 
3901 CAGTTCCGTT TAGGGCAATT GGGTATTCAG CCTTATTTGG GTGTTAATCG 
3951 ATATTTTATT GAACGTGAAA ATTATCAATC TGAAGAAGTG AAAGTGCAAA 
4001 CACCGAGCCT TGCATTTAAT CGCTATAATG CTGGCATTCG AGTTGATTAT 
4051 ACATTTACCC CGACAGATAA TATCAGCGTT AAGCCTTATT TCTTTGTCAA 
4101 TTATGTTGAT GTTTCAAACG CTAACGTACA AACCACTGTA AATAGCACGA 
4151 TGTTGCAACA ATCATTTGGG CGTTATTGGC AAAAAGAAGT GGGATTAAAG 
4201 GCAGAAATTT TACATTTCCA ACTTTCCGCT TTTATCTCAA AATCTCAAGG 
4251 TTCACAACTC GGTAAACAGC AAAATGTGGG CGTGAAATTG GGCTATCGTT 
4301 GGTAA 

Amino acid sequence for NTHi strain 11 Hap protein (first amino acid to last amino acid):

1 MKKTVFRLNF LTACISLGIV SQAWAGHTYF GIDYQYYRDF AENEGKFAVG 
51 AKNIDVYNKE GQLVGTSMTK APMIDFSVVS RNGVAALVGD QYIVSVAHNV 
101 GYTNVDFGAE GQNPDQHRFT YKIVKRNNYN HDAKHRYLDD YHNPRLHKFV 
151 TDAAPIDMTS HMDGNKYANK EKYPERVRVG SGDQYWDDDQ NNRTYLSDGY 
201 NYLTGGNTYN QSGRGDGYSY VRGDIRKVGD YGPLPIASSF GDSGSPMFIY 
251 DAETQKWLIN GVLREGQPYT GEFDGFQLAR KSFLDEIIRK DQPNGFLTPK 
301 GNGVYTISKS DDGIGVVTSK IGKPREIPLA NNKLKIEDKD TVYNNRYNGP 
351 NIYSPQLNNG KNIYFGDEEL GSITLTTDID QGAGGLYFEG DFIVSPTKNE 
401 TWKGAGIHVS EISTVTWKVN GVENDRLSKI GKGTLHVKAK GENKGSISVG 
451 DGKVILEQQA DDQGNKQAFS EIGLVSGRGT VQLNDDKQFD TDKFYFGFRG 
501 GRLDLNGHSL TFKRIQNTDE GAMIVNHNTT QVANITITGN ESITAPSNKN 
551 NINKLDYSKE IAYNGWFXET DKNKHNGRLN LIYKPTTEDR TLLLSGGTNL 
601 KGDITQTKGK LFFSGRPTPH AYNHLDKRWS EMEGIPQGEI VWDYDWINRT 
651 FKAENFQIKG GSAVVSRNVS SIEGNWTVSN NANATFGVVP NQQNTICTRS 
701 DWTGLTTCKT VNLTDKKVID SIPTTQINGS INLTDNATVN INGLAKLNGN 
751 VTLINHSQFT LSNNATQIGN IKLSNHANAR VNNATLMGDV NLADTSRFTL 
801 SNQATQIGTI SLHQQAQATV DNANLNGNVH LTDSARFSLK NSHFSHQIQG 
851 DKDTTVTLEN ATWTMPSDTT LQNLTLNNST VTLNSAYSAS SNNAPRRRRS 
901 LETETTPTSA EHRFNTLTVN GKLSGQGTFQ FTPSLFGYES DKLKLSNDAE 
951 GDYTLSVRNT GKEPVTLEQL TLVESKDNKP LSDKLKFTLE NDHVDAGALR 
1001 YKLVKNKGEF RLHNPIKEQE LRSDLVRAEQ AERTLEAKQV EQTAETQTSN 
1051 ARVRSRRAVL SDTPSAQSLL NALEVKQAEP NAKTQKSKAK TKKARSKRAL 
1101 REAFSDTPPD LSQLNVLEAA LKVINAQPQT EKERQAQEEE AKRQRKQKDL 
1151 ISRYSNSALS ELSATVNSML SVQDELDRLF VDQAQSALWT NIAQDKRRYD 
1201 SDAFRAYQQK TNLRQIGVQK ALDNGRIGAV FSHSRSDNTF DEQVKNHATL 
1251 TMMSGFAQYQ WGDLQFGVNV GAGISASKMA EEQSRKIHRK AINYGVNASY 
1301 QFRLGQLGIQ PYLGVNRYFI ERENYQSEEV KVQTPSLAFN RYNAGIRVDY 
1351 TFTPTDNISV KPYFFVNYVD VSNANVQTTV NSTMLQQSFG RYWQKEVGLK 
1401 AEILHFQLSA FISKSQGSQL GKQQNVGVKL GYRW 



Fig. 17 
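
The strain 11 protein listing can be cross-checked against the strain 11 nucleotide listing by translating the open reading frame codon by codon. The sketch below assumes the standard genetic code; the helper name `translate` is hypothetical, and reporting any codon that contains an ambiguous base as X is an assumption, chosen because it matches how the listings pair the N at nucleotide 1702 with the X at residue 568.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 1; '*' marks stop, 'X' an ambiguous codon."""
    seq = "".join(dna.split()).upper()   # drop the spaces between 10-base blocks
    usable = len(seq) - len(seq) % 3     # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


# First row of the strain 11 nucleotide listing (positions 1 to 50).
row_1 = "ATGAAAAAAA CTGTATTTCG TCTTAATTTT TTAACCGCTT GCATTTCATT"
print(translate(row_1))           # MKKTVFRLNFLTACIS
print(translate("TTTNGCGAAACA"))  # FXET (nucleotides 1699 to 1710)
```

The same check is how extraction damage such as a W printed in place of VV can be detected: the codons GTC GTT at nucleotides 232 to 237 translate to VV, not W.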



Nucleotide sequence for NTHi strain TN106 hap gene (start codon 
begins at position 422, stop codon begins at position 4595) : 



1 [illegible] CAAATTATTG CGACGGGTAC ACCAGAACAA GTTGCTAAAG 
51 [illegible] CCACACCGCT CGCTTCCTTA AACCGATTTT AGAAAAACCT 
101 [illegible] GACCGCACTT TCAGAGAAAA CTCACATAAA GTGCGGTTAT 
151 [illegible] ATATTGTTTT AATTTTAGTT ATCTGTATAA ATTACATACA 
201 [illegible] ATCGCAAGAT TAGATTACCC ACTAAGTATT AAGCAAAAAC 
251 [illegible] TGGCTTAATT ACTATATAGT TTTACTCATT TATTTTCTTT 
301 [illegible] AGTTCATTTT TTTAGCTGAA ATCCCTTAGA AAATCACCGC 
351 [illegible] TTCAATAGTC GTTTAACCAC GTATTTTTTA ATACGAAAAA 
401 TTACTTAATT AAATAAACAT TATGAAAAAA ACTGTATTTC GTCTGAATTT 
451 TTTAACCGCT TGCATTTCAT TAGGGATAGT ATCGCAAGCG TGGGCAGGTC 
501 ATACTTATTT TGGGATTGAC TACCAATATT ATCGTGATTT TGCCGAGAAT 
551 [illegible] TTACAGTTGG GGCTCAAGAT ATTGATATCT ACAATAAAAA 
601 [illegible] ATAGGTACGA TGATGAAAGG TGTGCCTATG CCTGATTTAT 
651 CTTCCATGGT TCGTGGTGGT TATTCAACAT TGATAAGTGA GCAGCATTTA 
701 ATTAGCGTCG CACATAATGT AGGGTATGAT GTCGTTGATT TTGGTATGGA 
751 GGGGGAAAAT CCAGACCAAC ATCGTTTTAA GTATAAAGTT GTTAAACGAT 
801 ATAATTATAA GAGCGGTGAT AGACAATATA ATGATTATCA ACATCCAAGA 
851 TTAGAGAAAT TTGTAACGGA AACTGCACCT ATTGAAATGG TTTCATATAT 
901 GGATGGTAAT CATTACAAAA ATTTTAATCA ATATCCTTTG CGAGTTAGAG 
951 TTGGAAGTGG GCATCAATGG TGGAAAGACG ATAATAATAA AACCATTGGA 
1001 GACTTAGCCT ATGGAGGTTC ATGGTTAATA GGTGGAAATA CCTTTGAAGA 
1051 TGGACCAGCT GGTAACGGTA CATTAGAATT AAATGGGCGA GTACAAAATC 
1101 CTAATAAATA TGGTCCACTA CCTACGGCAG GTTCATTCGG GGATAGTGGT 
1151 TCTCCAATGT TTATTTATGA TAAGGAAGTT AAGAAATGGT TATTAAATGG 
1201 CGTGTTACGT GAAGGAAATC CTTATGCTGC AGTAGGAAAC AGCTATCAAA 
1251 TTACACGAAA AGATTATTTT CAAGGTATTC TTAATCAAGA CATTACAGCT 
1301 AATTTTTGGG ATACTAATGC TGAATATAGA TTTAATATAG GGAGTGACCA 
1351 CAATGGAAGA GTGGCAACAA TCAAAAGTAC ATTACCTAAA AAAGCTATTC 
1401 AGCCTGAACG AATAGTGGGT CTTTATGATA ATAGCCAACT TCATGATGCT 
1451 AGAGATAAAA ATGGCGATGA ATCTCCCTCT TATAAAGGTC CTAATCCATG 
1501 GTCGCCAGCA TTACATCATG GGAAAAGTAT TTACTTTGGC GATCAAGGAA 
1551 CAGGAACTTT AACAATTGAA AATAATATAA ATCAAGGTGC AGGTGGATTG 
1601 TATTTTGAAG GTAATTTTGT TGTAAAAGGC AATCAAAATA ATATAACTTG 
1651 GCAAGGTGCA GGCGTTTCTG TTGGAGAAGA AAGTACTGTT GAATGGCAGG 
1701 TGCATAATCC AGAAGGCGAT CGCTTATCCA AAATTGGGCT GGGAACCTTA 
1751 CTTGTTAATG GTAAAGGGAA AAACTTAGGA AGCCTGAGTG TCGGTAACGG 
1801 TTTGGTTGTG TTAGATCAAC AAGCAGATGA ATCAGGTCAA AAACAAGCCT 
1851 TTAAAGAAGT TGGCATTGTA AGTGGTAGAG CTACCGTTCA ACTAAATAGT 
1901 GCAGATCAAG TTGATCCTAA CAATATTTAT TTCGGCTTTC GTGGTGGTCG 
1951 CTTAGATCTT AATGGGCATT CATTAACCTT TGAACGTATC CAAAATACGG 
2001 ATGAAGGCGC GATGATTGTG AACCACAACG CTTCTCAAAC CGCAAATATT 



Fig. 18A 



2051 ACGATTACAG GCAACGCAAC TATTAATTCA GATAGCAAAC AACTTACTAA

2101 TAAAAAAGAT ATTGCATTTA ACGGCTGGTT TGGTGAGCAA GATAAAGCTA

2151 AAACAAATGG TCGTTTAAAT GTGAATTATC AACCAGTTAA TGCAGAAAAT

2201 CATTTGTTGC TTTCTGGGGG GACAAATTTA AACGGCAATA TCACGCAAAA

2251 TGGTGGTACG TTAGTTTTTA GTGGTCGTCC AACGCCTCAT GCTTACAATC

2301 ATTTAAGAAG AGACTTGTCT AACATGGAAG GTATCCCACA AGGCGAAATT

2351 GTGTGGGATC ACGATTGGAT CAACCGCACA TTTAAAGCTG AAAACTTCCA

2401 AATTAAAGGC GGAAGTGCGG TGGTTTCTCG CAATGTTTCT TCAATTGAGG

2451 GAAATTGGAC AGTCAGCAAT AATGCAAATG CCACATTTGG TGTTGTGCCA

2501 AATCAGCAAA ATACCATTTG CACGCGTTCA GATTGGACAG GATTAACGAC

2551 TTGTAAAACA GTTGATTTAA CCGATAAAAA AGTTATTAAT TCCATACCGA

2601 CAACACAAAT TAATGGTTCT ATTAATTTAA CTGATAATGC AACAGTGAAT

2651 ATTCATGGTT TAGCAAAACT TAATGGTAAT GTCACTTTAA TAGATCACAG

2701 CCAATTTACA TTGAGCAACA ATGCCACCCA AACAGGCAAT ATCAAACTTT

2751 CAAATCACGC AAATGCAACG GTGGACAATG CAAATTTGAA CGGTAATGTG

2801 AATTTAATGG ATTCTGCTCA ATTTTCTTTA AAAAACAGCC ATTTTTCGCA

2851 CCAAATCCAA GGTGGGGAAG ACACAACAGT GATGTTGGAA AATGCGACTT

2901 GGACAATGCC TAGCGATACC ACATTGCAGA ATTTAACGCT AAATAATAGT

2951 ACTGTTACGT TAAATTCAGC TTATTCAGCT ATCTCAAATA ATGCGCCACG

3001 CCGTCGCCGC CGTTCATTAG AGACGGAAAC AACGCCAACA TCGGCAGAAC

3051 ATCGTTTCAA CACATTGACA GTAAATGGTA AATTGAGCGG GCAAGGCACA

3101 TTCCAATTTA CTTCATCTTT ATTTGGCTAT AAAAGCGATA AATTAAAATT

3151 ATCCAATGAC GCTGAGGGCG ATTACACATT ATCTGTTCGC AACACAGGCA

3201 AAGAACCCGT GACCTTTGGG CAATTAACTT TGGTTGAAAG CAAAGATAAT

3251 AAACCGTTAT CAGACAAACT CACATTCACG TTAGAAAATG ACCACGTTGA

3301 TGCAGGTGCA TTACGTTATA AATTAGTGAA GAATGATGGC GAATTCCGCT

3351 TACATAACCC AATAAAAGAG CAGGAATTGC GCTCTGATTT AGTAAGAGCA

3401 GAGCAAGCAG AACGAACATT AGAAGCCAAA CAAGTTGAAC AGACTGCTAA

3451 AACACAAACA AGTAAGGCAA GAGTGCGGTC AAGAAGAGCG GTGTTTTCTG

3501 ATCCCCTGCC TGCTCAAAGC CTGTTAAAAG CATTAGAAGC CAAACAAGCT

3551 CTGACTACTG AAACACAAAC AAGTAAGGCA AAAAAAGTGC GGTCAAAAAG

3601 AGCTGCGAGA GAGTTTTCTG ATACCCTGCC TGATCAAATA TTACAAGCCG

3651 CACTTGAGGT TATTGATGCC CAACAGCAAG TGAAAAAAGA ACCTCAAACT

3701 CAAGAGGAAG AAGAGAAAAG ACAACGCAAA CAAAAAGAAT TGATCAGCCG

3751 TTACTCAAAT AGTGCGTTAT CGGAGTTGTC TGCGACAGTA AATAGTATGC

3801 TTTCCGTTCA AGATGAATTG GATCGTCTTT TTGTAGATCA AGCACAATCT

3851 GCCGTGTGGA CAAATATCGC ACAGGATAAA AGACGCTATG ATTCTGATGC

3901 GTTCCGTGCT TATCAGCAGA AAACGAACTT GCGTCAAATT GGGGTGCAAA

3951 AAGCCTTAGA TAATGGACGA ATTGGGGCGG TTTTCTCGCA TAGCCGTTCA

4001 GATAATACCT TTGACGAACA GGTTAAAAAT CACGCGACAT TAGCGATGAT

4051 GTCGGGTTTT GCCCAATATC AATGGGGCGA TTTACAATTT GGTGTAAACG

4101 TGGGTGCGGG AATTAGTGCG AGTAAAATGG CTGAAGAACA AAGCCGAAAA

4151 ATTCATCGAA AAGCGATAAA TTATGGTGTG AATGCAAGTT ATCAGTTCCG

4201 TTTAGGGCAA TTGGGTATTC AGCCTTATTT GGGTGTTAAT CGATATTTTA



Fig. 18B 



4251 TTGAACGTGA AAATTATCAA TCTGAAGAAG TGAAAGTGCA AACACCGAGC

4301 CTTGTATTTA ATCGCTATAA TGCTGGCATT CGAGTTGATT ATACATTTAC

4351 CCCGACAGAT AATATCAGCA TTAAGCCTTA TTTCTTCGTC AATTATGTTG

4401 ATGTTTCAAA CGCTAACGTA CAAACCACTG TAAATCGCAC GATGTTGCAA

4451 CAATCATTTG GGCGTTATTG GCAAAAAGAA GTGGGATTAA AGGCAGAAAT

4501 TTTACATTTC CAACTTTCCG CTTTTATCTC AAAATCTCAA GGTTCACAAC

4551 TCGGCAAACA GCAAAATGTG GGCGTGAAAT TGGGGTATCG TTGGTAAAAA

4601 TCAAC



Fig. 18C 



Amino acid sequence for NTHi strain TN106 Hap protein (first
amino acid to last amino acid):



1 MKKTVFRLNF LTACISLGIV SQAWAGHTYF GIDYQYYRDF AENKGKFTVG

51 AQDIDIYNKK GEMIGTMMKG VPMPDLSSMV RGGYSTLISE QHLISVAHNV

101 GYDVVDFGME GENPDQHRFK YKVVKRYNYK SGDRQYNDYQ HPRLEKFVTE

151 TAPIEMVSYM DGNHYKNFNQ YPLRVRVGSG HQWWKDDNNK TIGDLAYGGS

201 WLIGGNTFED GPAGNGTLEL NGRVQNPNKY GPLPTAGSFG DSGSPMFIYD

251 KEVKKWLLNG VLREGNPYAA VGNSYQITRK DYFQGILNQD ITANFWDTNA

301 EYRFNIGSDH NGRVATIKST LPKKAIQPER IVGLYDNSQL HDARDKNGDE

351 SPSYKGPNPW SPALHHGKSI YFGDQGTGTL TIENNINQGA GGLYFEGNFV

401 VKGNQNNITW QGAGVSVGEE STVEWQVHNP EGDRLSKIGL GTLLVNGKGK

451 NLGSLSVGNG LVVLDQQADE SGQKQAFKEV GIVSGRATVQ LNSADQVDPN

501 NIYFGFRGGR LDLNGHSLTF ERIQNTDEGA MIVNHNASQT ANITITGNAT

551 INSDSKQLTN KKDIAFNGWF GEQDKAKTNG RLNVNYQPVN AENHLLLSGG

601 TNLNGNITQN GGTLVFSGRP TPHAYNHLRR DLSNMEGIPQ GEIVWDHDWI

651 NRTFKAENFQ IKGGSAVVSR NVSSIEGNWT VSNNANATFG VVPNQQNTIC

701 TRSDWTGLTT CKTVDLTDKK VINSIPTTQI NGSINLTDNA TVNIHGLAKL

751 NGNVTLIDHS QFTLSNNATQ TGNIKLSNHA NATVDNANLN GNVNLMDSAQ

801 FSLKNSHFSH QIQGGEDTTV MLENATWTMP SDTTLQNLTL NNSTVTLNSA

851 YSAISNNAPR RRRRSLETET TPTSAEHRFN TLTVNGKLSG QGTFQFTSSL

901 FGYKSDKLKL SNDAEGDYTL SVRNTGKEPV TFGQLTLVES KDNKPLSDKL

951 TFTLENDHVD AGALRYKLVK NDGEFRLHNP IKEQELRSDL VRAEQAERTL

1001 EAKQVEQTAK TQTSKARVRS RRAVFSDPLP AQSLLKALEA KQALTTETQT

1051 SKAKKVRSKR AAREFSDTLP DQILQAALEV IDAQQQVKKE PQTQEEEEKR

1101 QRKQKELISR YSNSALSELS ATVNSMLSVQ DELDRLFVDQ AQSAVWTNIA

1151 QDKRRYDSDA FRAYQQKTNL RQIGVQKALD NGRIGAVFSH SRSDNTFDEQ

1201 VKNHATLAMM SGFAQYQWGD LQFGVNVGAG ISASKMAEEQ SRKIHRKAIN

1251 YGVNASYQFR LGQLGIQPYL GVNRYFIERE NYQSEEVKVQ TPSLVFNRYN

1301 AGIRVDYTFT PTDNISIKPY FFVNYVDVSN ANVQTTVNRT MLQQSFGRYW

1351 QKEVGLKAEI LHFQLSAFIS KSQGSQLGKQ QNVGVKLGYR W



Fig. 19 
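The numbered rows used in these listings (a position label followed by groups of ten letters) can be collapsed into one continuous string before any analysis. The Python sketch below is illustrative only: the function name `join_numbered_rows` is hypothetical, and it assumes that position labels consist solely of digits, so a label that a scan has turned into letters would have to be removed by hand first. The sample rows are the last three rows of the TN106 Hap protein listing, written with the kind of spacing artifacts a scan can introduce.

```python
import re

def join_numbered_rows(text: str) -> str:
    """Collapse numbered sequence rows into one uppercase string.

    Assumption: everything that is not a letter (position labels,
    spaces, line breaks) is layout and can be discarded.
    """
    return re.sub(r"[^A-Za-z]", "", text).upper()

# Rows 1251, 1301 and 1351 of the TN106 Hap protein listing, with
# split labels ("13 01") and split groups ("AG I RVD YT FT").
rows = """
1251 YGVNASYQFR LGQLGIQPYL GVNRYFIERE NYQSEEVKVQ TPSLVFNRYN
13 01 AG I RVD YT FT PTDNISIKPY FFVNYVDVSN ANVQTTVNRT MLQQSFGRYW
13 51 QKEVGLKAEI LHFQLSAFIS KSQGSQLGKQ QNVGVKLGYR W
"""

tail = join_numbered_rows(rows)
# Position of the final residue, counting from the first row label.
last_residue = 1251 + len(tail) - 1
print(len(tail), last_residue)
```

Counting the letters this way is also a quick check on a transcribed row: every full row should contribute exactly 50 letters.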



Nucleotide sequence for NTHi strain 860295 hap gene (start codon
begins at position 430, stop codon begins at position 4738):



1 

JL 




GTGGCGGACA 


AATTATTGCG 


ACGGGTACGC 

Xi. \JVJVJ X XX\_» V-J \» 


CAGAACAAGT 


_> X, 


X NJJ V-r \^£^r\£-\\J X £1 


GAAAGTTCCC 


ACACCGCCCG 


CTTCCTTAAA 

v» x x v«. v_*. x x nrvn 


CCGATTTTAG 


X V/ X 


AAAAAffTTA 


GAAAAAATGA 


CCGCACTTTC 


AGAGAAAACT 


CACATAAAGT 

V-*** w** X A AA AA AVmI A> 


J J X. 


OPf^TVTTATTT 

VJV-VJVJI 1AJ. X X 


TAT T AGTG AT 

x i \. x x ivvj x \jn x 


ATTGTTTTAA 

AA A. X V — i X X X X AAAA 


TTTTAGTTAT 

x x x x nvj x x aa x 


CTGTATAAAT 

V_* X KJ X AA X A AA AA A X 


201 


TACATATAAT 

X x x V — ,x \ A- _1_ aa-IA X 


ATTAATCCAT 


CGCAAGATAA 


GATTACCCAC 


TAAGTATTAA 


251 


GCAAAAACCT 


AGAAATTTTG 


GCTTAATTAC 


TATATAGTTT 


TACTGCTTTA 


301 


TTTTCTTTTG 

X X X X \_» X X X X VJ 


TGCCTTTTAG 

X V-J >— * X X X X X x\J 


TT CGTTTTTT 

A- A. V-J X X X X X X 


TAGCTGAAAT 


CCCTTAGAAA 


351 


ATCACCGCAC 


TTTTATTGTT 


CAATAGTCGT 


TTAACCACGT 


ATTTTTTAAT 


401 


ACGAAAAATT 


ACTTAATTAA 


ATAAACATTA 

A A X A Xi LTTl V-~X A X X A A 


TGAAAAAAAC 

x v*_ja aa aa uwuiriv 


TGTATTTCGT 


451 


CTGAACTTTT 


TAACCGCTTG 


C AT TT CATT A 

*AA XXX V— *AA X X A A 


GGGATAGTAT 


CGCAAGCGTG 


501 


GGCAGGT CAC 


ACTTATTTTG 

Vta* X X A A A. A X X \J 


GGATTGACTA 


CCAATATTAT 


CGTGATTTTG 


551 


CTGAGAATAA 


AGGGAAGTTT 

n v w vjamiw a. a. a. 


TCAGTTGGGG 


CTAAAAATAT 


TGAGGT TT AT 


DUX 




vj\jj/\L- 1 1 1 ACjl 


1 1 LA 




LLL l—Vj/\ JL x 


651 TGATTTTTCT GTGGTGTCGC GAAATGGGGT GGCGGCATTA GTAGGCGATC

701 AGTATATTGT GAGTGTGGCA CATAACGGTG GATATAATAG CGTTGATTTT

751 GGAGCAGAAG GTCCAAATCC CGATCAGCAT CGTTTTACTT ATCAAATTGT

801 AAAAAGAAAT AATTATAAGC CAGGCAAAGA TAACCCTTAT CATGGTGACT

851 ATCACATGCC TCGTTTGCAC AAATTTGTCA CTGACGCTGA ACCAGCAAAG

901 ATGACAGACA ATATGAATGG AAAGAACTAC GCTGATTTAA GTAAATATCC

951 TGATCGTGTG CGTATTGGTA CAGGTGAACA ATGGTGGAGG ACTGATGAAG

1001 AACAAAAGCA AGGAAGTAAG AGTTCATGGC TTGCTGATGC TTATCTGTGG

1051 AGAATAGCAG GTAACACACA TTCACAAAGT GGAGCGGGCA ACGGCACGGT

1101 AAACTTAAGT GGAGATATCA CAAAACCAAA TAACTATGGA CCTCTTCCTA

1151 CGGGTGTTTC GTTTGGAGAT AGTGGTTCTC CAATGTTTAT TTATGATGCA

1201 ATAAAACAAA AATGGCTTAT TAATGGCGTA TTGCAAACTG GTAACCCTTT

1251 CTCGGGAGCT GGAAATGGAT TCCAATTAAT TAGAAAAAAT TGGTTTTATG

1301 ATAATGTCTT TGTAGAAGAT TTGCCTATAA CATTTTTAGA GCCAAGAAGT

1351 AACGGTCATT ATTCATTTAC TTCAAATAAT AATGGAACTG GTACGGTTAC

1401 TCAAACGAAT GAAAAAGTGA GTATGCCTCA ATTTAAAGTC AGAACGGTTC

1451 AGTTATTTAA TGAAGCATTA AAAGAAAAAG ATAAAGAACC TGTTTATGCT

1501 GCAGGTGGTG TAAATGCTTA TAAACCAAGA CTAAATAATG GTAAAAATAT

1551 TTACTTTGGC GATCGAGGAA CAGGAACTTT AACAATTGAA AATAATATAA

1601 ATCAAGGTGC TGGTGGTTTG TATTTTGAGG GTAACTTTAC GGTATCTTCA

1651 GAAAATAATG CAACTTGGCA AGGTGCTGGA GTGCATGTAG GTGAAGACAG

1701 TACTGTTACT TGGAAAGTAA ACGGCGTGGA ACATGATCGC CTTTCTAAAA

1751 TTGGTAAAGG AACGTTGCAT ATTCAAGCAA AAGGTGAAAA CTTAGGCTCA

1801 ATTAGCGTAG GTGACGGCAA AGTCATTTTA GATCAACAAG CCGATGAGAA

1851 CAACCAAAAA CAAGCCTTTA AAGAAGTTGG CATTGTAAGT GGTAGAGCTA

1901 CCGTTCAACT AAATAGTGCA GATCAAGTTG ATCCTAACAA TATTTATTTC

1951 GGATTTCGTG GTGGTCGCTT AGATCTTAAC GGACATTCAT TAACCTTTAA

2001 ACGTATCCAA AATACGGACG AGGGCGCGAT GATTGTGAAC CATAATACAA



Fig. 20A 



2051 CTCAAGTCGC TAATATTACT ATTACTGGGA ACGAAAGTAT TACTGCTCCA

2101 TCTAATAAAA ATAATATTAA TAAACTTGAT TACAGCAAAG AAATTGCTTA

2151 CAACGGTTGG TTTGGCGAAA CAGATGAAAA TAAACACAAT GGAAGATTAA

2201 ACCTTATTTA TAAACCAACC ACAGAAGATC GTACTTTGCT ACTTTCAGGT

2251 GGAACAAATT TAAAAGGCAA TATTACTCAG GAAGGCGGCA CTTTAGTGTT

2301 TAGTGGTCGC CCAACTCCAC ACGCTTACAA TCATTTAAAT CGCCCAAACG

2351 AGCTTGGGCG ACCTCAAGGC GAAGTGGTTA TTGATGACGA TTGGATCACC

2401 CGCACATTTA AAGCTGAAAA CTTCCAAATT AAAGGCGGAA GTGCGGTGGT

2451 TTCTCGCAAT GTTTCTTCAA TTGAGGGAAA TTGGACAGTC AGCAATAATG

2501 CAAATGCCGC ATTTGGTGTT GTGCCAAATC AGCAAAATAC CATTTGCACG

2551 CGTTCAGATT GGACAGGATT AACGACTTGT AAAACTGTGG ATTTAACCGA

2601 TACAAAAGTT ATTAATTCCA TACCGACAAC ACAAATTAAT GGCTCTATTA

2651 ATTTAACTGA TAATGCAACA GTGAATATTC ATGGTTTAGC AAAACTTAAT

2701 GGTAATGTCA CTTTAATAAA TCATAGCCAA TTTACATTGA GCAACAATGC

2751 CACCCAAACA GGCAATATCC AACTTTCAAA TCACGCAAAT GCAACGGTGG

2801 ACAATGCAAA TTTGAACGGT AATGTGCATT TAACGGATTC TGCTCAATTT

2851 TCTTTAAAAA ACAGCCATTT TTCGCACCAA ATTCAGGGCG ACAAAGACAC

2901 AACAGTGACG TTGGAAAATG CGACTTGGAC AATGCCTAGC GATGCCACAT

2951 TGCAGAATTT AACGCTAAAT AATAGTACTG TTACGTTAAA TTCAGCTTAT

3001 TCAGCTAGCT CAAATAATGC GCCACGTCAC CGCCGTTCAT TAGAGACGGA

3051 AACAACGCCA ACATCGGCAG AACATCGTTT CAACACATTG ACAGTAAATG

3101 GTAAATTGAG CGGGCAAGGC ACATTCCAAT TTACTTCATC TTTATTTGGC

3151 TATAAAAGCG ATAAATTAAA ATTATCCAAT GACGCTGAGG GCGATTACAC

3201 ATTATCTGTT CGCAACACAG GCAAAGAACC CGAAGCCCTT GAGCAATTAA

3251 CTTTGGTTGA AAGCAAAGAT AATAAACCGT TATCAGACAA ACTCAAATTT

3301 ACTTTAGAAA ATGACCACGT TGATGCAGGT GCATTACGTT ATAAATTAGT

3351 GAAGAATAAT GGCGAATTCC GCTTGCATAA CCCAATAAAA GAGCAGGAAT

3401 TGCGCAATGA TTTAGTAAGA GCAGAGCAAG CAGAACGAAC ATTAGAAGCC

3451 AAACAAGTTG AACAGACTGC TGAAACACAA ACAAGTAATG CAAGAGTGCG

3501 GTCAAAAAGA GCGGTGTTTT CTGATACCCT GCCTGATCAA AGCCAGTTAG

3551 ACGTATTACA AGCCGAACAA GTTGAACCGA CTGCTGAAAA ACAAAAAAAT

3601 AAGGCAAAAA AAGTGCGGTC AAAAAGAGCG GTGTTTTCTG ATACCCTGCC

3651 TGATCAAAGC CAGTTAGACG TATTACAAGC CGAACAAGTT GAACCGACTG

3701 CTGAAAAACA AAAAAATAAG GCAAAAAAAG TGCGGTCAAA AAGAGCCGCG

3751 AGAGAGTTTT CTGATACCCC GCTTGATCTA AGCCGGTTAA AGGTATTAGA

3801 AGTCAAACTT GAGGTTATTA ATGCCCAACA GCAAGTGAAA AAAGAACCTC

3851 AAGATCAAGA GAAACAACGC AAACAAAAAG ACTTGATCAG CCGTTATTCA

3901 AATAGTGCGT TATCAGAATT ATCTGCAACA GTAAATAGTA TGCTTTCTGT

3951 TCAAGATGAA TTAGATCGTC TTTTTGTAGA TCAAGCACAA TCTGCCGTGT


a n n i 








r\ .I \J.t\ 1 1 V_ 1 an 




4051 GCTTATCAGC AGAAAACGAA CTTACGTCAA ATTGGGGTGC AAAAAGCCTT

4101 AGCTAATGGA CGAATTGGGG CAGTTTTCTC GCATAGCCGT TCAGATAATA

4151 CTTTTGATGA ACAGGTTAAA AATCACGCGA CATTAACGAT GATGTCGGGT

4201 TTTGCCCAAT ATCAATGGGG CGATTTACAA TTTGGTGTAA ACGTGGGAAC

4251 GGGAATCAGT GCGAGTAAAA TGGCTGAAGA ACAAAGCCGA AAAATTCATC

4301 GAAAAGCGAT AAATTATGGC GTGAATGCAA GTTATCAGTT CCGTTTAGGG

4351 CAATTGGGCA TTCAGCCTTA TTTTGGAGTT AATCGCTATT TTATTGAACG

4401 TGAAAATTAT CAATCTGAGG AAGTGAAAGT GAAAACGCCT AGCCTTGCAT

4451 TTAATCGCTA TAATGCTGGC ATTCGAGTTG ATTATACATT TACTCCGACA

4501 GATAATATCA GCGTTAAGCC TTATTTCTTC GTCAATTATG TTGATGTTTC

4551 AAACGCTAAC GTACAAACCA CGGTAAATAG CACGGTGTTG CAACAACCAT

4601 TTGGACGTTA TTGGCAAAAA GAAGTGGGAT TAAAAGCGGA AATTTTACAT

4651 TTCCAACTTT CTGCTTTTAT TTCTAAATCT CAAGGTTCGC AACTCGGCAA

4701 ACAGCAAAAT GTGGGCGTGA AATTGGGGTA TCGTTGGTAA AAATCAACAT

4751 AATTGTATCG TTTATTGATA AACAAGGTGG GGCAGATCCC ACCTTTTTTA

4801 TTTCAATAAT GGAACTTTAT TTAATTAAGA GCATCTAAGT AGCACCCCAT

4851 ATAGGGGATT AATTAAGAGG ATTTAATAAT GAATTTAACT AAACTTTTAC

4901 CAGCATTTGC TGCTGCAGTC GTATTATCTG CTTGTGCAAA GGATGCACCT

4951 GAAATGACAA AATCATCTGC GCAAATAGCT GAAATGCAAA CACTTCCAAC

5001 AATCACTGAT AAAACAGTTG TATATTCCTG CAATAAACAA ACTGTAACTG

5051 CCGTGTATCA ATTTGAAAAC CAAGAACCAG TTGCTGCAAT GGTAAGTGTG

5101 GGCGATGGCA TTATTGCGAA AGATTTTACT CGTGATAAAT CACAAAATGA

5151 CTTTACAAGT TTCGTTTCTG GGGATTATGT TTGGAATGTA GATAGTGGCT

5201 TAACGTTAGA TAAATTTGAT TCTGTTGTGC CTGTCAATTT AATTC
Amino acid sequence for NTHi strain 860295 Hap protein (first
amino acid to last amino acid):



1 MKKTVFRLNF LTACISLGIV SQAWAGHTYF GIDYQYYRDF AENKGKFSVG

51 AKNIEVYNKE GTLVGTSMTK APMIDFSVVS RNGVAALVGD QYIVSVAHNG

101 GYNSVDFGAE GPNPDQHRFT YQIVKRNNYK PGKDNPYHGD YHMPRLHKFV

151 TDAEPAKMTD NMNGKNYADL SKYPDRVRIG TGEQWWRTDE EQKQGSKSSW

201 LADAYLWRIA GNTHSQSGAG NGTVNLSGDI TKPNNYGPLP TGVSFGDSGS

251 PMFIYDAIKQ KWLINGVLQT GNPFSGAGNG FQLIRKNWFY DNVFVEDLPI

301 TFLEPRSNGH YSFTSNNNGT GTVTQTNEKV SMPQFKVRTV QLFNEALKEK

351 DKEPVYAAGG VNAYKPRLNN GKNIYFGDRG TGTLTIENNI NQGAGGLYFE

401 GNFTVSSENN ATWQGAGVHV GEDSTVTWKV NGVEHDRLSK IGKGTLHIQA

451 KGENLGSISV GDGKVILDQQ ADENNQKQAF KEVGIVSGRA TVQLNSADQV

501 DPNNIYFGFR GGRLDLNGHS LTFKRIQNTD EGAMIVNHNT TQVANITITG

551 NESITAPSNK NNINKLDYSK EIAYNGWFGE TDENKHNGRL NLIYKPTTED

601 RTLLLSGGTN LKGNITQEGG TLVFSGRPTP HAYNHLNRPN ELGRPQGEVV

651 IDDDWITRTF KAENFQIKGG SAVVSRNVSS IEGNWTVSNN ANAAFGVVPN

701 QQNTICTRSD WTGLTTCKTV DLTDTKVINS IPTTQINGSI NLTDNATVNI

751 HGLAKLNGNV TLINHSQFTL SNNATQTGNI QLSNHANATV DNANLNGNVH

801 LTDSAQFSLK NSHFSHQIQG DKDTTVTLEN ATWTMPSDAT LQNLTLNNST

851 VTLNSAYSAS SNNAPRHRRS LETETTPTSA EHRFNTLTVN GKLSGQGTFQ

901 FTSSLFGYKS DKLKLSNDAE GDYTLSVRNT GKEPEALEQL TLVESKDNKP

951 LSDKLKFTLE NDHVDAGALR YKLVKNNGEF RLHNPIKEQE LRNDLVRAEQ

1001 AERTLEAKQV EQTAETQTSN ARVRSKRAVF SDTLPDQSQL DVLQAEQVEP

1051 TAEKQKNKAK KVRSKRAVFS DTLPDQSQLD VLQAEQVEPT AEKQKNKAKK

1101 VRSKRAAREF SDTPLDLSRL KVLEVKLEVI NAQQQVKKEP QDQEKQRKQK

1151 DLISRYSNSA LSELSATVNS MLSVQDELDR LFVDQAQSAV WTNIAQDKRR

1201 YDSDAFRAYQ QKTNLRQIGV QKALANGRIG AVFSHSRSDN TFDEQVKNHA

1251 TLTMMSGFAQ YQWGDLQFGV NVGTGISASK MAEEQSRKIH RKAINYGVNA

1301 SYQFRLGQLG IQPYFGVNRY FIERENYQSE EVKVKTPSLA FNRYNAGIRV

1351 DYTFTPTDNI SVKPYFFVNY VDVSNANVQT TVNSTVLQQP FGRYWQKEVG

1401 LKAEILHFQL SAFISKSQGS QLGKQQNVGV KLGYRW
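The start- and stop-codon positions quoted in the nucleotide captions fix the expected length of each Hap protein, which gives a simple consistency check between a nucleotide listing and its amino acid listing. The Python sketch below is a minimal illustration: the function name `expected_protein_length` is hypothetical, and it assumes 1-based positions and an uninterrupted reading frame from the first base of the start codon to the first base of the stop codon.

```python
def expected_protein_length(start_pos: int, stop_pos: int) -> int:
    """Residues encoded between a start codon beginning at start_pos
    and a stop codon beginning at stop_pos (1-based positions).

    Assumption: one uninterrupted reading frame, so the distance
    between the two positions must be a positive multiple of three.
    """
    span = stop_pos - start_pos
    if span <= 0 or span % 3 != 0:
        raise ValueError("stop codon is not in frame with the start codon")
    return span // 3

# (start, stop) positions as quoted in the captions of this listing.
caption_positions = {
    "860295": (430, 4738),
    "3219B": (388, 4561),
    "1396B": (313, 4546),
}

lengths = {
    strain: expected_protein_length(start, stop)
    for strain, (start, stop) in caption_positions.items()
}
print(lengths)
```

The resulting lengths (1436, 1391 and 1411 residues) agree with the final rows of the corresponding amino acid listings, which end at 1401 + 36, 1351 + 41 and 1401 + 11 residues respectively.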
Nucleotide sequence for NTHi strain 3219B hap gene (start codon
begins at position 388, stop codon begins at position 4561):



1 CCTGAAGACG TTGCTCAAGT TAAAGGCTCT CACACAGCCC GATTCCTTAA

51 ACCGATTTTA GAAAAACCTT AGAAAAAATG ACCGCACTTT CAGAGAAAAC

101 TCACATAAAG TGCGGTTATT TTATTAGTGA TATTGTTTTA ATTATTTGTA

151 TAAATTACAT ACAATATTAA TCCATCGAAA AATAAGATTA CCCACTAAGT

201 ATTAAGCCAA AACCTAGAAA TTTTGGCTTA ATTACTATAT AATTTTACTC

251 CTTTATTTTC TTTTGTGCCT TTTAGTTAGT TCGTTTTTTA GCTGAAATCC

301 CTCAGAAAAT CACCGCACTT TTATTGTTCA ATAGTCGTTT AACCACGTAT

351 TTTTTAATAC GAAAAATTAC TTAATTAAAT AAACATTATG AAAAAAACTG

401 TATTTCGTCT TAATTTTCTA ACCGCTTGTA TTTCATTAGG GATAGTATCG

451 CAAGCGTGGG CAGGTCACAC TTATTTTGGG ATTGACTACC AATATTATCG

501 TGATTTTGCC GAGAATAAAG GGAAGTTTAC AGTTGGGGCT CAAGATATTG

551 ATATCTACAA TAAAAAAGGG GAAATGATAG GTACGATGAT GAAAGGTGTG

601 CCTATGCCTG ATTTATCTTC CATGGTTCGT GGTGGTTATT CAACATTGAT

651 AAGTGAGCAG CATTTAATTA GCGTCGCACA TAATGTAGGG TATGATGTCG

701 TTGATTTTGG TATGGAGGGG GAAAATCCAG ACCAACATCG TTTTAAGTAT

751 AAAGTTGTTA AACGATATAA TTATAAGAGC GGTGATAGAC AATATAATGA

801 TTATCAACAT CCAAGATTAG AGAAATTTGT AACGGAAACT GCACCTATTG

851 AAATGGTTTC ATATATGGAT GGTAATCATT ACAAAAATTT TAATCAATAT

901 CCTTTGCGAG TTAGAGTTGG AAGTGGGCAT CAATGGTGGA AAGACGATAA

951 TAATAAAACC ATTGGAGACT TAGCCTATGG AGGTTCATGG TTAATAGGTG

1001 GAAATACCTT TGAAGATGGA CCAGCTGGTA ACGGTACATT AGAATTAAAT

1051 GGGCGAGTAC AAAATCCTAA TAAATATGGT CCACTACCTA CGGCAGGTTC

1101 ATTCGGGGAT AGTGGTTCTC CAATGTTTAT TTATGATAAG GAAGTTAAGA

1151 AATGGTTATT AAATGGCGTG TTACGTGAAG GAAATCCTTA TGCTGCAGTA

1201 GGAAACAGCT ATCAAATTAC ACGAAAAGAT TATTTTCAAG GTATTCTTAA

1251 TCAAGACATT ACAGCTAATT TTTGGGATAC TAATGCTGAA TATAGATTTA

1301 ATATAGGGAG TGACCACAAT GGAAGAGTGG CAACAATCAA AAGTACATTA

1351 CCTAAAAAAG CTATTCAGCC TGAACGAATA GTGGGTCTTT ATGATAATAG

1401 CCAACTTCAT GATGCTAGAG ATAAAAATGG CGATGAATCT CCCTCTTATA

1451 AAGGTCCTAA TCCATGGTCG CCAGCATTAC ATCATGGGAA AAGTATTTAC

1501 TTTGGCGATC AAGGAACAGG AACTTTAACA ATTGAAAATA ATATAAATCA

1551 AGGTGCAGGT GGATTGTATT TTGAAGGTAA TTTTGTTGTA AAAGGCAATC

1601 AAAATAATAT AACTTGGCAA GGTGCAGGCG TTTCTGTTGG AGAAGAAAGT

1651 ACTGTTGAAT GGCAGGTGCA TAATCCAGAA GGCGATCGCT TATCCAAAAT

1701 TGGGCTGGGA ACCTTACTTG TTAATGGTAA AGGGAAAAAC TTAGGAAGCC

1751 TGAGTGTCGG TAACGGTTTG GTTGTGTTAG ATCAACAAGC AGATGAATCA

1801 GGTCAAAAAC AAGCCTTTAA AGAAGTTGGC ATTGTAAGTG GTAGAGCTAC

1851 CGTTCAACTA AATAGTGCAG ATCAAGTTGA TCCTAACAAT ATTTATTTCG

1901 GCTTTCGTGG TGGTCGCTTA GATCTTAATG GGCATTCATT AACCTTTGAA

1951 CGTATCCAAA ATACGGATGA AGGCGCGATG ATTGTGAACC ACAACGCTTC

2001 TCAAACCGCA AATATTACGA TTACAGGCAA CGCAACTATT AATTCAGATA
2051 GCAAACAACT TACTAATAAA AAAGATATTG CATTTAACGG CTGGTTTGGT
2101 GAGCAAGATA AAGCTAAAAC AAATGGTCGT TTAAATGTGA ATTATCAACC
2151 AGTTAATGCA GAAAATCATT TGTTGCTTTC TGGGGGGACA AATTTAAACG
2201 GCAATATCAC GCAAAATGGT GGTACGTTAG TTTTTAGTGG TCGTCCAACG 
2251 CCTCATGCTT ACAATCATTT AAGAAGAGAC TTGTCTAACA TGGAAGGTAT 
2301 CCCACAAGGC GAAATTGTGT GGGATCACGA TTGGATCAAC CGCACATTTA 
2351 AAGCTGAAAA CTTCCAAATT AAAGGCGGAA GTGCGGTGGT TTCTCGCAAT 
2401 GTTTCTTCAA TTGAGGGAAA TTGGACAGTC AGCAATAATG CAAATGCCAC 
2451 ATTTGGTGTT GTGCCAAATC AGCAAAATAC CATTTGCACG CGTTCAGATT 
2501 GGACAGGATT AACGACTTGT AAAACAGTTG ATTTAACCGA TAAAAAAGTT 
2551 ATTAATTCCA TACCGACAAC ACAAATTAAT GGTTCTATTA ATTTAACTGA
2601 TAATGCAACA GTGAATATTC ATGGTTTAGC AAAACTTAAT GGTAATGTCA 
2651 CTTTAATAGA TCACAGCCAA TTTACATTGA GCAACAATGC CACCCAAGCA 
2701 GGCAATATCA AACTTTCAAA TCACGCAAAT GCAACGGTGG ACAATGCAAA 
2751 TTTGAACGGT AATGTGAATT TAATGGATTC TGCTCAATTT TCTTTAAAAA 
2801 ACAGCCATTT TTCGCACCAA ATCCAAGGTG GGGAAGACAC AACAGTGATG 
2851 TTGGAAAATG CGACTTGGAC AATGCCTAGC GATACCACAT TGCAGAATTT 
2901 AACGCTAAAT AATAGTACTG TTACGTTAAA TTCAGCTTAT TCAGCTATCT 

2951 CAAATAATGC GCCACGCCGT CGCCGCCGTT CATTAGAGAC GGAAACAACG
3001 CCAACATCGG CAGAACATCG TTTCAACACA TTGACAGTAA ATGGTAAATT 
3051 GAGCGGGCAA GGCACATTCC AATTTACTTC ATCTTTATTT GGCTATAAAA 
3101 GCGATAAATT AAAATTATCC AATGACGCTG AGGGCGATTA CACATTATCT 
3151 GTTCGCAACA CAGGCAAAGA ACCCGTGACC TTTGGGCAAT TAACTTTGGT 
3201 TGAAAGCAAA GATAATAAAC CGTTATCAGA CAAACTCACA TTCACGTTAG 
3251 AAAATGACCA CGTTGATGCA GGTGCATTAC GTTATAAATT AGTGAAGAAT 
3301 GATGGCGAAT TCCGCTTACA TAACCCAATA AAAGAGCAGG AATTGCGCTC 
3351 TGATTTAGTA AGAGCAGAGC AAGCAGAACG AACATTAGAA GCCAAACAAG 
3401 TTGAACAGAC TGCTAAAACA CAAACAAGTA AGGCAAGAGT GCGGTCAAGA 
3451 AGAGCGGTGT TTTCTGATCC CCTGCCTGCT CAAAGCCTGT TAAACGCATT 
3501 AGAAGCCAAA CAAGCTCTGA CTACTGAAAC ACAAACAAGT AAGGCAAAAA 

3551 AAGTGCGGTC AAAAAGAGCT GCGAGAGAGT TTTCTGATAC CCTGCCTGAT
3601 CAAATATTAC AAGCCGCACT TGAGGTTATT GATGCCCAAC AGCAAGTGAA 
3651 AAAAGAACCT CAAACTCAAG AGGAAGAAGA GAAAAGACAA CGCAAACAAA 
3701 AAGAATTGAT CAGCCGTTAC TCAAATAGTG CGTTATCGGA GTTGTCTGCG 
3751 ACAGTAAATA GTATGCTTTC CGTTCAAGAT GAATTGGATC GTCTTTTTGT 
3801 AGATCAAGCA CAATCTGCCG TGTGGACAAA TATCGCACAG GATAAAAGAC 
3851 GCTATGATTC TGATGCGTTC CGTGCTTATC AGCAGAAAAC GAACTTGCGT 
3901 CAAATTGGGG TGCAAAAAGC CTTAGATAAT GGACGAATTG GGGCGGTTTT
3951 CTCGCATAGC CGTTCAGATA ATACCTTTGA CGAACAGGTT AAAAATCACG
4001 CGACATTAGC GATGATGTCT GGTTTTGCCC AATATCAATG GGGCGATTTA 
4051 CAATTTGGTG TAAACGTGGG TGCGGGAATT AGTGCGAGTA AAATGGCTGA
4101 AGAACAAAGC CGAAAAATTC ATCGAAAAGC GATAAATTAT GGTGTGAATG 
4151 CAAGTTATCA GTTCCGTTTA GGGCAATTGG GTATTCAGCC TTATTTGGGT 
4201 GTTAATCGAT ATTTTATTGA ACGTGAAAAT TATCAATCTG AAGAAGTGAA
4251




PPGAGCCTTG 


TATTTAAT CG 


CTATAATGCT 


GGCATT CGAG 


a ~\ n i 

T± O u X 




ATTTACCCCG 


ACAGATAATA 


TCAGCATTAA 


GCCTTATTTC 


*± J .J X 


TTPCiTPAATT 


ATGTTGATGT 


TTCAAACGCT 


AACGTACAAA 


CCACTGTAAA 


a a m 

*± *± L/ X 


X v^vjv— i»V \J^i. A. \J 


TTGCAACAAT 


CATTTGGGCG 


TTATTGGCAA 


AAAGAAGTGG 


ft ft 3 X 


HATTAAAGfir 

X X rtririvjvjv-. 


AGAAAT TT T A 


CATTTCCAAC 


TTTCCGCTTT 


TAT CTCAAAA 




TPTPAAGGTT 


CACAACTCGG 


CAAACAGCAA 


AATGTGGGCG 


TGAAATTGGG 


*± 3 .J X 


nTATPHTTnO 

vjirii vox xvjvj 


TAAAAATCAA 


CATAATTTTA 


TCGTTTATTG 


ATAAACAAGG 


4 6 01 






PTTTTTT ATT 
V_ X X X X X lnX X 


PPAATAATfiG 


AAPTTTATTT 

/Irtv — , XXX il XXX 


4651 


TATTAAAGGT 


ATCTAAGTAG 


CACCCTATAT 


AGGGATTAAT 


TAAGAGGATT 


4701 


TAATAATGAA 


TTTAACTAAA 


ATTTTACCCA 


CATTTGCTGC 


TGTAGTCGTA 


4751 


TTATCTGCTT 


GTGCAAAGGA 


TGCACCTGAA 


ATGACAAAAT 


CATCTGCGCA 


4801 


AATAGCTGAA 


ATGCAAACAC 


TT 







Fig. 22C 



Amino acid sequence for NTHi strain 3219B Hap protein (first
amino acid to last amino acid):



1 MKKTVFRLNF LTACISLGIV SQAWAGHTYF GIDYQYYRDF AENKGKFTVG

51 AQDIDIYNKK GEMIGTMMKG VPMPDLSSMV RGGYSTLISE QHLISVAHNV

101 GYDVVDFGME GENPDQHRFK YKVVKRYNYK SGDRQYNDYQ HPRLEKFVTE

151 TAPIEMVSYM DGNHYKNFNQ YPLRVRVGSG HQWWKDDNNK TIGDLAYGGS

201 WLIGGNTFED GPAGNGTLEL NGRVQNPNKY GPLPTAGSFG DSGSPMFIYD

251 KEVKKWLLNG VLREGNPYAA VGNSYQITRK DYFQGILNQD ITANFWDTNA

301 EYRFNIGSDH NGRVATIKST LPKKAIQPER IVGLYDNSQL HDARDKNGDE

351 SPSYKGPNPW SPALHHGKSI YFGDQGTGTL TIENNINQGA GGLYFEGNFV

401 VKGNQNNITW QGAGVSVGEE STVEWQVHNP EGDRLSKIGL GTLLVNGKGK

451 NLGSLSVGNG LVVLDQQADE SGQKQAFKEV GIVSGRATVQ LNSADQVDPN

501 NIYFGFRGGR LDLNGHSLTF ERIQNTDEGA MIVNHNASQT ANITITGNAT

551 INSDSKQLTN KKDIAFNGWF GEQDKAKTNG RLNVNYQPVN AENHLLLSGG

601 TNLNGNITQN GGTLVFSGRP TPHAYNHLRR DLSNMEGIPQ GEIVWDHDWI

651 NRTFKAENFQ IKGGSAVVSR NVSSIEGNWT VSNNANATFG VVPNQQNTIC

701 TRSDWTGLTT CKTVDLTDKK VINSIPTTQI NGSINLTDNA TVNIHGLAKL

751 NGNVTLIDHS QFTLSNNATQ AGNIKLSNHA NATVDNANLN GNVNLMDSAQ

801 FSLKNSHFSH QIQGGEDTTV MLENATWTMP SDTTLQNLTL NNSTVTLNSA

851 YSAISNNAPR RRRRSLETET TPTSAEHRFN TLTVNGKLSG QGTFQFTSSL

901 FGYKSDKLKL SNDAEGDYTL SVRNTGKEPV TFGQLTLVES KDNKPLSDKL

951 TFTLENDHVD AGALRYKLVK NDGEFRLHNP IKEQELRSDL VRAEQAERTL

1001 EAKQVEQTAK TQTSKARVRS RRAVFSDPLP AQSLLNALEA KQALTTETQT

1051 SKAKKVRSKR AAREFSDTLP DQILQAALEV IDAQQQVKKE PQTQEEEEKR

1101 QRKQKELISR YSNSALSELS ATVNSMLSVQ DELDRLFVDQ AQSAVWTNIA

1151 QDKRRYDSDA FRAYQQKTNL RQIGVQKALD NGRIGAVFSH SRSDNTFDEQ

1201 VKNHATLAMM SGFAQYQWGD LQFGVNVGAG ISASKMAEEQ SRKIHRKAIN

1251 YGVNASYQFR LGQLGIQPYL GVNRYFIERE NYQSEEVKVQ TPSLVFNRYN

1301 AGIRVDYTFT PTDNISIKPY FFVNYVDVSN ANVQTTVNRT MLQQSFGRYW

1351 QKEVGLKAEI LHFQLSAFIS KSQGSQLGKQ QNVGVKLGYR W
Nucleotide sequence for NTHi strain 1396B hap gene (start codon
begins at position 313, stop codon begins at position 4546):



1 

X 


TH A PCGC ACT 


TTCAGAGAAA 


ACT CAC AT AA 


AGTGCGGTTA 


TTTTATTAGT 




nZkTATTGTTT 

Vjrt X li X X VJ XXX 


TAATTTTAGT 


TATCTGTATA 


AATTACATAC 


AATATTAATC 


i n i 

X U X 




TAAGATTACC 


CACTAAGTAT 


TAAGCAAAAA 


CCTAGAAATT 


1 CI 

J. Zj -L 


TTGHfTTAAT 


TACTATATAG 


TTTTACTCAT 


TTATTTTCTT 


TTGTGCCTTT 


KJ -L 


TAGTTCGTTT 

X ii\J X X VVJ X J- A- 


TTTTAGCTGA 


AATCCCTTAG 


AAAATCACCG 


CACTTTTATT 


251 GTTCAATAGT CGTTTAACCA CGTATTTTTT AATACGAAAA ATTACTTAAT

301 TAAATAAACA TTATGAAAAA AACTGTATTT CGTCTGAATT TTTTAACCGC

351 TTGCATTTCA TTAGGGATAG TATCGCAAGC GTGGGCAGGT CATACTTATT

401 TTGGGATTGA CTACCAATAT TATCGTGATT TTGCCGAGAA TAAAGGGAAG

451 TTCACAGTTG GGGCTAAAAA TATTGAGGTT TACAATAAAA ATGGAAATTT

501 AGTTGGCACA TCAATGACAA AAGCCCCAAT GATTGATTTT TCCGTGGTGT

551 CGCGAAATGG GGTGGCGGCA TTGGTGGGCG ATCAGTATAT TGTGAGTGTG

601 GCACATAATG TAGGCTATAC CAATGTGGAT TTTGGTGCTG AAGGACAAAA

651 TCCTGATCAA CATCGTTTTA CTTATAAAAT TGTGAAACGG AATAATTATA

701 AAAACGATCA AACGCATCCT TATGAGAAAG ACTACCACAA CCCACGCTTA

751 CATAAATTTG TTACGGAAGC CACCCCAATC GATATGACTT CTGATATGAA

801 CGGCAACAAA TATACAGATA GGACGAAATA TCCCGAACGC GTGCGTATCG

851 GCTCCGGGTG GCAGTTTTGG CGAAACGATC AAAACAACGG CGACCAAGTT

901 GCCGGCGCAT ATCATTACCT GACAGCAGGC AATACACACA ACCAAGGCGG

951 AGCAGGGGGC GGCTGGTCAA GTCTGAGCGG CGATGTGCGC CAAGCGGGCA

1001 ATTACGGCCC CATTCCTATT GCAGGCTCAA GCGGCGACAG CGGTTCGCCT

1051 ATGTTTATTT ATGATGCGGA AAAACAAAAA TGGTTGATTA ACGGCGTATT

1101 GAGGACCGGC AACCCTTGGG CGGGGACAGA GAATACATTC CAACTGGTAC

1151 GCAAGTCTTT TTTTGATGAA ATCCTTGAAA AAGATTTGCG TACATCGTTT

1201 TATAGCCCAT CGGGCAATGG TGCATACACC ATTACAGACA AAGGCGACGG

1251 CAGCGGCATT GTCAAACAAC AAACAGGAAG ACCATCTGAA GTCCGCATCG

1301 GTTTAAAAGA CGACAAATTA CCTGCCGAAG GTAAAGACGA TGTTTACCAA

1351 TACCAAGGTC CAAATATATA CCTGCCTCGT TTGAATAACG GTGGAAACCT

1401 GTATTTCGGA GATCAAAAAA ACGGCACTGT TACCTTATCA ACCAACATCA

1451 ACCAAGGTGC GGGCGGTTTG TATTTTGAGG GTAACTTTAC GGTATCTTCA

1501 GAAAATAATG CAACTTGGCA AGGTGCTGGA GTGCATGTAG GTGAAGACAG

1551 TACTGTTACT TGGAAAGTAA ATGGTGTTGA AAATGATCGC CTTTCTAAAA

1601 TCGGCAAAGG CACATTGCAC GTTAAAGCCA AAGGGGAAAA TAAAGGTTCG

1651 ATCAGCGTAG GCGATGGTAA AGTCATTTTG GAGCAGCAGG CAGACGATCA

1701 AGGCAACAAA CAAGCCTTTA GTGAAATTGG CTTGGTTAGT GGCAGAGGTA

1751 CGGTTCAGTT AAACGATGAC AAGCAATTTA ATACTGATAA ATTTTATTTC

1801 GGCTTCCGTG GTGGTCGCTT AGATCTTAAT GGGCATTCAT TAACCTTTAA

1851 ACGTATCCAA AATACGGATG AGGGAGCAAC GATTGTTAAT CACAATGCCA

1901 CAACAGAATC TACAGTGACC ATTACTGGCA GCGATACCAT TAATGACAAC

1951 ACTGGCGATT TAACCAATAA ACGTGATATT GCTTTTAATG GTTGGTTTGG

2001 TGATAAAGAT GATACTAAAA ATACTGGACG TTTGAATGTT ACTTACAATC
GTGGGAACGG
GTGAAAGTGC
Amino acid sequence for NTHi strain 1396B Hap protein (first
amino acid to last amino acid):



1 MKKTVFRLNF LTACISLGIV SQAWAGHTYF GIDYQYYRDF AENKGKFTVG

51 AKNIEVYNKN GNLVGTSMTK APMIDFSVVS RNGVAALVGD QYIVSVAHNV

101 GYTNVDFGAE GQNPDQHRFT YKIVKRNNYK NDQTHPYEKD YHNPRLHKFV

151 TEATPIDMTS DMNGNKYTDR TKYPERVRIG SGWQFWRNDQ NNGDQVAGAY

201 HYLTAGNTHN QGGAGGGWSS LSGDVRQAGN YGPIPIAGSS GDSGSPMFIY

251 DAEKQKWLIN GVLRTGNPWA GTENTFQLVR KSFFDEILEK DLRTSFYSPS

301 GNGAYTITDK GDGSGIVKQQ TGRPSEVRIG LKDDKLPAEG KDDVYQYQGP

351 NIYLPRLNNG GNLYFGDQKN GTVTLSTNIN QGAGGLYFEG NFTVSSENNA

401 TWQGAGVHVG EDSTVTWKVN GVENDRLSKI GKGTLHVKAK GENKGSISVG

451 DGKVILEQQA DDQGNKQAFS EIGLVSGRGT VQLNDDKQFN TDKFYFGFRG

501 GRLDLNGHSL TFKRIQNTDE GATIVNHNAT TESTVTITGS DTINDNTGDL

551 TNKRDIAFNG WFGDKDDTKN TGRLNVTYNP LNKDNHFLLS GGTNLKGNIT

601 QDGGTLVFSG RPTPHAYNHL NRLNELGRPK GEVVIDDDWI NRTFKAENFQ

651 IKGGSTVVSR NVSSIEGNWT ISNNANATFG VVPNQQNTIC TRSDWTGLTT

701 CKTVNLTDKK VIDSIPTTQI NGSINLTNNA TVNIHGLAKL NGNVTLINHS

751 QFTLSNNATQ TGNIQLSNHA NATVDNANLN GNVHLTDSAQ FSLKNSHFSH

801 QIQGDKDTTV TLENATWTMP SDTTLQNLTL NNSTVTLNSA YSASSNNAPR

851 HRRSLETETT PTSEEHRFNT LTVNGKLSGQ GTFQFTSSLF GYKSDKIKLS

901 NDAEGDYTLA VRDTGKEPVT LEQLTLIEGL DNQPLPDKLK ITLKNKHVDA

951 GAWRYELVKK NGEFRLHNPI KEQELRNDLV KAEQVERALE AKQAELTTKK

1001 QKTEAKVRSK RAAFSDTPPD QSQLNALQAE LETINAQQQV AQAVQNQKVT

1051 ALNQKNEQVK TTQDKANLVL ATALVEKETA QIDFANAKLA QLNLTQQLEK

1101 ALAVAEQAEK ERKAQEQAKR QRKQKDLISR YSNSALSELS ATVNSMLSVQ

1151 DELDRLFVDQ AQSAVWTNIS QDKRRYDSDA FRAYQQKTNL RQIGVQKALA

1201 NGRIGAVFSH SRSDNTFDEQ VKNHATLTMM SGFAQYQWGD LQFGVNVGTG

1251 ISASKMAEEQ SRKIHRKAIN YGVNASYSFH LGQLGIQPYF GVNRYFIERK

1301 NYQSEEVKVQ TPSLAFNRYN AGVRVDYTFT PTENISVKPY FFVNYVDVSN

1351 ANVQTTVNRA VLQQPFGRYW QKEVGLKAEI LHFQLSAFIS KSQGSQLGKQ

1401 RNMGVKLGYR W